In these lecture notes I will discuss the universal first-passage properties of a simple correlated discrete-time sequence $\{x_0 = 0, x_1, x_2, \ldots, x_n\}$ up to $n$ steps, where $x_i$ represents the position at step $i$ of a random walker hopping on a continuous line by drawing independently, at each time step, a random jump length from an arbitrary symmetric and continuous distribution (this includes, e.g., Levy flights). I will focus on the statistics of two extreme observables associated with the sequence: (i) its global maximum and the time step at which the maximum occurs and (ii) the number of records in the sequence and their ages. I will demonstrate how the universal statistics of these observables emerge as a consequence of the Pollaczek-Spitzer formula and the associated Sparre Andersen theorem.

I. INTRODUCTION

Since the remarkable founding paper by Einstein in 1905, followed closely by two seminal papers by Smoluchowski [2] and Langevin [3] respectively, random walks and the associated continuous-time Brownian motion have remained fundamental cornerstones of statistical physics, with an impressive number of applications that range from traditional 'natural' sciences such as physics, chemistry, biology, mathematics and astronomy all the way to 'man-made' subjects such as computer science and finance. Even though many aspects of this classical subject are extremely well understood and form textbook material, it is fascinating that new questions with nontrivial answers, arising from new applications, continue to spring surprises.

In these lectures I will discuss some of these recent applications. The general area of random walks is vast, with an enormous literature. My goal for these lectures is rather modest. I will just focus on a rather simple and restricted model: a discrete-time random hopper on a continuous line. Starting from the origin $x_0 = 0$, the position of the particle at step $n$ evolves via the Markov rule $x_n = x_{n-1} + \xi_n$, where the $\xi_n$ denote the random jumps at different time steps. These jumps are independent and identically distributed (i.i.d.) random variables, each drawn from the same distribution $\phi(\xi)$, which is symmetric and continuous. If the walk evolves up to step $n$, one generates a sequence or a discrete-time series: $\{x_0 = 0, x_1, x_2, \ldots, x_n\}$. Clearly the members of this sequence are correlated random variables. Such a sequence is perhaps the simplest possible correlated sequence that appears rather naturally in many different contexts. A classic example of such a walk can be found in bacterial chemotaxis, where a bacterium, in search of food, jumps from one position to another at discrete time steps. In the context of queueing theory, $x_n$ represents the length of a single queue at time $n$. In the context of the evolution of stock prices, $x_n$ represents the logarithm of the price of a stock at time $n$. It can also represent the $x$ coordinates of the beads of a Rouse polymer chain in thermal equilibrium in $d$ dimensions (when the jump distribution is Gaussian) (see also [13]). When the jump distribution has a power law tail with a divergent second moment, $\phi(\xi) \sim |\xi|^{-1-\mu}$ (with $0 < \mu < 2$) for large $|\xi|$, this sequence represents a Levy flight, which also has an enormous number of applications [15]-[19].

Here I will focus on two extreme observables associated with such a correlated sequence: (i) the global maximum $M_n = \max\{0, x_1, x_2, \ldots, x_n\}$ of the sequence and the associated time step $m$ at which this maximum is realized in a given sample, and (ii) the number and ages of records of this sequence, where a record is said to happen at step $i$ if $x_i$ is bigger than all the previous values: $x_i > x_k$ for all $0 \le k < i$. The age of a record is simply the number of steps for which this record survives, i.e., until it gets surpassed by the next record-breaking event.
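
To make these definitions concrete, here is a minimal Python sketch that extracts the maximum, its time step, the record steps and the record ages from a given sequence. The function name and the two conventions in it are assumptions made for illustration only: the initial position $x_0$ is counted as the first record, and the age of the last record is counted up to the final step $n$.

```python
def maximum_and_records(xs):
    """Return (M_n, m, record_steps, ages) for xs = [x_0, x_1, ..., x_n].

    Assumptions (illustrative conventions, not taken from the text):
    - xs[0] = 0 is counted as the first record;
    - the last record's age is the number of steps remaining up to step n.
    """
    n = len(xs) - 1
    # Step at which the global maximum is attained (first occurrence).
    m = max(range(n + 1), key=lambda i: xs[i])
    # A record happens at step i if x_i exceeds all previous values.
    record_steps = [0]
    best = xs[0]
    for i in range(1, n + 1):
        if xs[i] > best:
            record_steps.append(i)
            best = xs[i]
    # Age of a record = number of steps until the next record breaks it.
    ages = [b - a for a, b in zip(record_steps, record_steps[1:])]
    ages.append(n - record_steps[-1])
    return xs[m], m, record_steps, ages
```

With these conventions the ages always add up to $n$, and the last record coincides with the global maximum.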

Now, if the number of steps $n$ of the sequence is large and if the second moment of the jump length distribution, $\sigma^2 = \int_{-\infty}^{\infty} \xi^2\, \phi(\xi)\, d\xi$, is finite, one would expect, correctly, to recover the continuous-time limit results of Brownian motion as a consequence of the central limit theorem, at least for the global maximum (records are not very well defined in the continuous-time limit). However, it turns out, as I will show here in some detail, that many properties associated with extreme events such as the global maximum or the number of records are completely universal for all $n$, i.e., they do not depend on the jump length distribution $\phi(\xi)$ at all, whatever the value of $n$, as long as $\phi(\xi)$ is symmetric and continuous. Note, in particular, that this universality does not even require a finite $\sigma^2$, e.g., it holds even for long-range Levy flights.

In fact, this universality has nothing to do with the central limit theorem. Instead, it will turn out to be a consequence of the Sparre Andersen theorem [20] concerning the first-passage properties of such a random walk sequence [1, 21]. This is a rather deep combinatorial theorem: the final result looks deceptively simple, though its derivation is far from simple. Here I will provide a derivation of this result using another result on the generating function of the maximum of such a sequence, known as the Pollaczek-Spitzer formula [22, 23]. Somehow these results are not so well known among physicists. So, I'll discuss these results in some detail and use them to derive some universal and some nonuniversal properties associated with the statistics of the maximum and the records of this random walk sequence.

In the latter half of my lectures in the school, I also discussed the statistical properties of the functionals of Brownian 
motion via the Feynman-Kac formula and in particular, various interesting applications of the so called first-passage 
Brownian functionals, where one considers the Brownian motion till its first-passage time. They turn out to have 
various applications: in queueing theory, in finance, in simple models of particle moving in a disordered random 
potential and in astrophysics where one is interested in the distribution of the life time of a comet in the solar system. 
However, I will not include this interesting topic in these lecture notes, as I have already discussed it in another 
article [24]. The interested readers may consult this article and also another review on Brownian functionals with
interesting applications in the localisation theory [25].

This article is organised as follows. In Section II, I define the model precisely and review some basic preliminaries
to remind the readers of the central limit theorem and the Levy stable laws. In Section III, I discuss the first-passage
properties associated with the random walk sequence and discuss the Pollaczek-Spitzer formula and how this formula 
leads to the Sparre Andersen theorem. Section IV is devoted to the statistics of the global maximum and the 
universal statistics of the time of its occurrence where we use the Sparre Andersen theorem. In Section V, we discuss 
the statistics of the number of records and their ages and show how universal properties emerge again as a consequence 
of the Sparre Andersen theorem. Finally, I conclude in Section VI with a summary and some open problems. 

II. RANDOM WALKS, BROWNIAN MOTION, LEVY FLIGHTS: SOME PRELIMINARIES 

A. Definitions 

Let us start with a simple discrete-time random walker moving on a continuous line. The position $x_n$ of the walker after $n$ steps evolves for $n \ge 1$ via

$$ x_n = x_{n-1} + \xi_n \qquad (1) $$

starting at $x_0 = 0$, where the step lengths $\xi_n$ are i.i.d. random variables with zero mean, each drawn from a normalized (to unity) distribution $\phi(\xi)$ which is symmetric, $\phi(\xi) = \phi(-\xi)$ (see Fig. 1).

A few examples of the jump length distribution $\phi(\xi)$ are:

(i) $\phi(\xi) = \frac{1}{2}\, e^{-|\xi|}$ (Exponential)

(ii) $\phi(\xi) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\, e^{-\xi^2/2\sigma_0^2}$ (Gaussian)

(iii) $\phi(\xi) = \frac{1}{2}\left[\theta(\xi + 1) - \theta(\xi - 1)\right]$ (Uniform)

(iv) $\phi(\xi) \sim |\xi|^{-1-\mu}$ for large $|\xi|$ with $0 < \mu < 2$, such that $\sigma^2 = \int_{-\infty}^{\infty} \xi^2\, \phi(\xi)\, d\xi$ does not exist (Levy flights)

(v) $\phi(\xi) = \frac{1}{2}\,\delta(\xi + 1) + \frac{1}{2}\,\delta(\xi - 1)$ (Lattice random walk where the lattice spacing is unity).

In the first 4 of these examples, the cumulative jump distribution $\Phi(x) = \int_{-\infty}^{x} \phi(\xi)\, d\xi$ is a continuous function. In the last example (v), where the walker is restricted to move on a one-dimensional lattice with unit lattice spacing, the cumulative jump distribution $\Phi(x)$ is a non-continuous function. We will see later that this continuity property of $\Phi(x)$ will play an important role. Note further that in examples (i)-(iii) and (v), the variance of the step length, $\sigma^2 = \int_{-\infty}^{\infty} \xi^2\, \phi(\xi)\, d\xi$, is finite. We will see that in such cases the central limit theorem holds. In the Levy case (iv), the central limit theorem breaks down.

The evolution equation (1) is Markovian since the position $x_n$ at step $n$ depends only on the position $x_{n-1}$ at the previous time step (and not on the full history before the $(n-1)$-th step) and on the current noise, i.e., the noise $\xi_n$ at step $n$. This Markovian property makes life simple, as we will see later. As a simple example of a non-Markovian evolution consider the rule

$$ x_n = 2x_{n-1} - x_{n-2} + \xi_n \qquad (2) $$

where the $\xi_n$ are again i.i.d. random variables. This is just the discrete-time version of the continuous-time random acceleration problem: $d^2x/dt^2 = \xi(t)$, where $\xi(t)$ is a Gaussian white noise with zero mean, $\langle \xi(t) \rangle = 0$, and delta correlator $\langle \xi(t)\,\xi(t') \rangle = \delta(t - t')$. It turns out that the first-passage properties of even this simple non-Markovian evolution are highly nontrivial [26]-[28]. We will not consider non-Markov evolution rules further in these lectures and focus only on the Markov evolution (1). For the first-passage properties of non-Markovian stochastic processes see [28] and references therein.

FIG. 1: A trajectory of a random walker starting at the initial position $x_0$ and evolving with the number of steps $n$.

Iterating the Markov evolution rule (1) up to $n$ steps, it follows that the position $x_n$ of the walker after $n$ steps, starting at $x_0 = 0$, is simply a sum of $n$ i.i.d. random variables

$$ x_n = \sum_{k=1}^{n} \xi_k. \qquad (3) $$

In the case when $\sigma^2$ is finite, using the independence property of the step lengths $\xi_k$, it follows that the mean square displacement of the particle after $n$ steps, for all $n$, is simply

$$ \langle x_n^2 \rangle = n\,\sigma^2. \qquad (4) $$



Brownian limit: At this point, it is useful to consider the continuous-time limit where the random walk reduces to a Brownian motion. Let us define $\Delta t$ as a small time interval and set $t = n\,\Delta t$. Then (4) gives

$$ \langle x^2(t) \rangle = \frac{\sigma^2}{\Delta t}\, t. \qquad (5) $$

If one now takes the limit $\Delta t \to 0$, it follows that $\sigma^2 \to 0$ also, in order that $\langle x^2(t) \rangle$ remains finite at finite time $t$. Thus, to have a meaningful continuous-time limit, the mean square step length must scale as $\sigma^2 = 2D\,\Delta t$ as $\Delta t \to 0$ with a finite diffusion constant $D$, leading to the diffusive law of Brownian motion $\langle x^2(t) \rangle = 2Dt$ for all $t$. In this continuous-time limit, one can also rewrite the Markov evolution rule (1) as

$$ \frac{\Delta x}{\Delta t} = \frac{\xi_n}{\Delta t} = \xi(t) \qquad (6) $$

where $\xi(t)$ is a random noise with zero mean that is uncorrelated at two different times, $\langle \xi(t_1)\,\xi(t_2) \rangle = 0$ for $t_1 \ne t_2$. At the same time instant, however, $\langle \xi^2(t) \rangle = \sigma^2/(\Delta t)^2 = 2D/\Delta t$. Thus, as $\Delta t \to 0$, $\langle \xi^2(t) \rangle$ diverges. A useful physicist's way of writing this correlation function of the noise is $\langle \xi(t_1)\,\xi(t_2) \rangle = 2D\,\delta(t_1 - t_2)$. In this limit it is called the white noise and one writes (6) as a stochastic Langevin equation

$$ \frac{dx}{dt} = \xi(t) \qquad (7) $$

where $\xi(t)$ is the white noise with zero mean and a correlator $\langle \xi(t_1)\,\xi(t_2) \rangle = 2D\,\delta(t_1 - t_2)$. Note that for all practical purposes, such as in numerical simulation, one will interpret the delta function as $\delta(0) = 1/\Delta t$.
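
The prescription $\delta(0) = 1/\Delta t$ can be illustrated with a short simulation. The following is a minimal sketch, not a production integrator: the function name, the choice of Gaussian noise and all parameter values are assumptions made for illustration. At each step the noise is drawn with variance $2D/\Delta t$ and the position is advanced by $\xi\,\Delta t$, so the mean square displacement should come out close to $2Dt$.

```python
import math
import random

def simulate_msd(D=0.5, dt=0.01, n_steps=50, n_paths=10000, seed=1):
    """Estimate <x^2(t)> at t = n_steps * dt for the discretized Langevin rule.

    Assumption for illustration: the noise at each step is Gaussian with
    variance 2D/dt, i.e. the delta function is read as delta(0) = 1/dt.
    """
    rng = random.Random(seed)
    noise_std = math.sqrt(2.0 * D / dt)
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n_steps):
            # One Euler step of dx/dt = xi(t): increment is xi * dt.
            x += rng.gauss(0.0, noise_std) * dt
        total += x * x
    return total / n_paths
```

With the default values, $t = 0.5$ and the estimate should lie within statistical error of $2Dt = 0.5$. Each increment has variance $(2D/\Delta t)(\Delta t)^2 = 2D\,\Delta t$, which is exactly the scaling $\sigma^2 = 2D\,\Delta t$ required above.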

We will see later that in the Brownian limit many properties of the walk, such as its first-passage probability, 
become much simpler. In contrast, for discrete time evolution, even though the process is Markov, some of these 
properties are quite nontrivial. 



B. Green's Function 



Let us get back to our basic discrete-time Markov evolution (1). In this subsection, let us compute a basic object, namely the free (bare) Green's function $G(x, x_0, n)$, defined as the probability density of the position of the walker at $x$ after step $n$, given that it started from $x_0$ at step 0. Using the Markov property, one can easily write down a recursion relation for the evolution of $G(x, x_0, n)$

$$ G(x, x_0, n) = \int_{-\infty}^{\infty} G(x', x_0, n-1)\, \phi(x - x')\, dx' \qquad (8) $$

which counts the event of the particle jumping from its position $x'$ at step $n-1$ to its position $x$ at step $n$ by an amount $(x - x')$ drawn from the distribution $\phi$. This is called the forward Kolmogorov equation, since one considers the current position $x$ of the walker as a variable. Alternatively, one can also write down a backward Kolmogorov equation where one considers the starting position $x_0$ of the walker as a variable

$$ G(x, x_0, n) = \int_{-\infty}^{\infty} G(x, x_0', n-1)\, \phi(x_0' - x_0)\, dx_0'. \qquad (9) $$

Here one considers the displacement of the particle at the first step from $x_0$ to $x_0'$, and for the subsequent evolution up to $(n-1)$ steps the starting position of the walker is $x_0'$. Both equations are completely equivalent to each other. We will see later, however, that for certain first-passage related quantities, the backward equation is often computationally more advantageous than the forward one.

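These integral equations (8) or (9) can be easily solved using Fourier transforms. For example, for the forward equation, we define

$$ \tilde{G}(k, x_0, n) = \int_{-\infty}^{\infty} G(x, x_0, n)\, e^{ikx}\, dx \qquad (10) $$

and use the convolution form of (8) to get $\tilde{G}(k, x_0, n) = \tilde{G}(k, x_0, n-1)\, \hat{\phi}(k)$, where $\hat{\phi}(k) = \int_{-\infty}^{\infty} \phi(x)\, e^{ikx}\, dx$ is the Fourier transform of $\phi(x)$. Iterating $n$ times and using the initial condition $G(x, x_0, 0) = \delta(x - x_0)$, and hence $\tilde{G}(k, x_0, 0) = e^{ikx_0}$, one gets $\tilde{G}(k, x_0, n) = [\hat{\phi}(k)]^n\, e^{ikx_0}$. Inverting the Fourier transform one obtains the exact Green's function

$$ G(x, x_0, n) = \int_{-\infty}^{\infty} [\hat{\phi}(k)]^n\, e^{-ik(x - x_0)}\, \frac{dk}{2\pi}. \qquad (11) $$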
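
As an illustration of (11), the following Python sketch evaluates the inverse Fourier integral numerically for the exponential jump distribution of example (i), whose Fourier transform is $1/(1 + k^2)$. The function names, the cutoff `k_max` and the step `dk` are assumptions chosen for illustration. Since the Fourier transform of a symmetric distribution is real and even in $k$, the integral reduces to a cosine integral over $k \ge 0$.

```python
import math

def phi_hat_exponential(k):
    """Fourier transform of phi(xi) = exp(-|xi|)/2, example (i)."""
    return 1.0 / (1.0 + k * k)

def green_function(x, x0, n, phi_hat=phi_hat_exponential, k_max=100.0, dk=0.01):
    """Evaluate Eq. (11) with the trapezoidal rule.

    For a real, even phi_hat the integral over all k equals
    (1/pi) * int_0^inf phi_hat(k)^n cos(k (x - x0)) dk.
    The cutoff k_max is an illustrative choice, adequate for n >= 2 here.
    """
    steps = int(round(k_max / dk))
    total = 0.0
    for j in range(steps + 1):
        k = j * dk
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * phi_hat(k) ** n * math.cos(k * (x - x0))
    return total * dk / math.pi
```

For $n = 2$ the result can be compared with the convolution of two exponential densities, $\frac{1}{4}(1 + |x|)\,e^{-|x|}$. For larger $n$ the value at the peak approaches the Gaussian value $1/\sqrt{2\pi n \sigma^2}$ with $\sigma^2 = 2$ for this distribution.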



Let us now see what happens for large $n$. In cases where $\sigma^2$ is finite, one has, for small $k$, $\hat{\phi}(k) \approx 1 - \sigma^2 k^2/2 + O(k^4)$. Now, for large $n$, the dominant contribution to the integral in (11) comes from the small $k$ region. Substituting the small $k$ behavior, exponentiating and performing the Gaussian integral, one gets, for large $n$, the standard Gaussian behavior

$$ G(x, x_0, n) \approx \frac{1}{\sqrt{2\pi n \sigma^2}}\, \exp\left[-\frac{(x - x_0)^2}{2 n \sigma^2}\right] \qquad (12) $$



which is essentially the statement of the central limit theorem (CLT). Note that the universal Gaussian form holds only near the central peak but not at the tails, which are described by a nonuniversal large deviation function that I will not discuss here. On the other hand, for jump distributions with a divergent $\sigma^2$ (such as for Levy flights in example (iv)), the CLT breaks down [18]. For Levy flights, one can write for small $k$, $\hat{\phi}(k) \approx 1 - |ak|^{\mu}$, where $0 < \mu < 2$ is the Levy index and $a$ is a microscopic length. Substituting this in (11) and rescaling $k \to k/(a\, n^{1/\mu})$, one gets, for large $n$,

$$ G(x, x_0, n) \to \frac{1}{a\, n^{1/\mu}}\, \mathcal{L}_\mu\left(\frac{x - x_0}{a\, n^{1/\mu}}\right) \qquad (13) $$

where the function

$$ \mathcal{L}_\mu(z) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{-|k|^{\mu} - ikz} \qquad (14) $$

is called the Levy stable function of index $\mu$ [15]-[17]. Note that this function $\mathcal{L}_\mu(z)$, for large $z$, has the same power law tail, $\mathcal{L}_\mu(z) \sim |z|^{-1-\mu}$, as the jump distribution itself. For some special values of $\mu$, one can compute this function explicitly. Thus, the result in (13) is the statement of the Levy stable law [15], [16]: the sum of i.i.d. Levy distributed variables is itself Levy distributed (up to a rescaling by $n^{1/\mu}$), i.e., the Levy distribution is stable under addition. This is thus the counterpart of the CLT, which is the analogous statement for the sum of i.i.d. random variables with a finite $\sigma^2$: the stable law for the CLT is Gaussian. Note that from (13) it follows that the typical distance traversed by the particle in $n$ steps scales super-diffusively: $x \sim n^{1/\mu}$ for $0 < \mu < 2$.
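
The integral (14) is easy to evaluate numerically, which gives a quick check on the special cases where it is known in closed form. The sketch below is illustrative only: the function name and the cutoff and step values are assumptions, and the default cutoff is suited to $\mu \ge 1$ (for smaller $\mu$ the integrand decays more slowly and a larger cutoff would be needed).

```python
import math

def levy_stable(z, mu, k_max=60.0, dk=0.002):
    """Numerically evaluate L_mu(z) of Eq. (14).

    The integrand is even in k, so
    L_mu(z) = (1/pi) * int_0^inf exp(-k^mu) cos(k z) dk,
    computed here with the trapezoidal rule up to an assumed cutoff k_max.
    """
    steps = int(round(k_max / dk))
    total = 0.0
    for j in range(steps + 1):
        k = j * dk
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-(k ** mu)) * math.cos(k * z)
    return total * dk / math.pi
```

Two standard closed forms serve as checks: $\mu = 1$ gives the Cauchy density $1/[\pi(1 + z^2)]$, whose tail $\sim |z|^{-2}$ matches the $|z|^{-1-\mu}$ law, and $\mu = 2$ gives the Gaussian $e^{-z^2/4}/\sqrt{4\pi}$.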

Brownian limit: In the continuous-time limit, when $\sigma^2$ is finite and hence the CLT holds, the integral equations (8) or (9) reduce to partial differential equations. For example, given the Langevin evolution in (7), the forward Kolmogorov equation (8) reduces to

$$ G(x, x_0, t + \Delta t) = \int_{-\infty}^{\infty} G(x - \xi(t)\,\Delta t,\, x_0,\, t)\, \phi(\xi(t))\, d\xi(t). \qquad (15) $$

Expanding the Green's function on the rhs in a Taylor series, keeping terms up to $O((\Delta t)^2)$ and using the properties $\langle \xi(t) \rangle = 0$ and $\langle \xi^2(t) \rangle = 2D/\Delta t$, one gets, taking $\Delta t \to 0$, the well known diffusion equation for the Green's function

$$ \frac{\partial G}{\partial t} = D\, \frac{\partial^2 G}{\partial x^2} \qquad (16) $$

starting from the initial condition $G(x, x_0, 0) = \delta(x - x_0)$. Similarly one can write down a backward diffusion equation with $x$ in (16) replaced by $x_0$. The solution of the forward (or the backward) diffusion equation can be easily found using Fourier transforms and one recovers, as expected, the Gaussian behavior

$$ G(x, x_0, t) = \frac{1}{\sqrt{4\pi D t}}\, \exp\left[-\frac{(x - x_0)^2}{4Dt}\right]. \qquad (17) $$



For Levy flights, where $\sigma^2$ is infinite and the CLT breaks down, one can still formally define a continuous-time limit and obtain the so-called Levy fractional diffusion equation (for a review and discussion, see [18]). This simply follows by rewriting the basic recursion relation (8) as

$$ G(x, x_0, n) = \int_{-\infty}^{\infty} G(x - \xi_n, x_0, n-1)\, \phi(\xi_n)\, d\xi_n. \qquad (18) $$

Next we write $G(x - \xi_n, x_0, n-1) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde{G}(k, x_0, n-1)\, e^{-ik(x - \xi_n)}$, where $\tilde{G}$ is the Fourier transform defined in (10), and substitute it in (18). This gives

$$ G(x, x_0, n) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde{G}(k, x_0, n-1)\, \hat{\phi}(k)\, e^{-ikx}. \qquad (19) $$

Following similar arguments as in the Brownian case, in the large $n$ limit, one needs to keep only the small $k$ contribution of $\hat{\phi}(k) \approx 1 - |ak|^{\mu}$ in (19). This gives

$$ G(x, x_0, n) - G(x, x_0, n-1) = -a^{\mu} \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, |k|^{\mu}\, \tilde{G}(k, x_0, n-1)\, e^{-ikx}. \qquad (20) $$

Now, we need to divide both sides by the time increment $\Delta t$ and take the limit $\Delta t \to 0$. To obtain a sensible limit, one needs to take the $a \to 0$ limit as well, keeping the ratio $a^{\mu}/\Delta t = K$ fixed. This gives a continuous-time integro-differential equation

$$ \frac{\partial G}{\partial t} = -K \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, |k|^{\mu}\, \tilde{G}(k, x_0, t)\, e^{-ikx} = -K \left(-\partial_x^2\right)^{\mu/2} G \qquad (21) $$

where the integral in $k$ space can be formally interpreted as a fractional derivative. Note that for $\mu = 2$, one recovers the standard diffusion equation. But for $0 < \mu < 2$, one still needs to solve an integral equation even in the continuous-time limit. Thus for Levy flights, even though one can formally write down a continuous-time equation, it is not as useful as the ordinary Brownian case, where one has a true differential equation in real space whose solution can be easily obtained. This continuous-time fractional diffusion equation has been studied extensively in the recent past (for a review see [18]) and many interesting results, in particular concerning first-passage properties, have been derived (see for instance [29], [30]). However, in these lectures I will not use this approach and will rather stick to the discrete-time evolution.



III. RANDOM WALKS: SURVIVAL AND FIRST-PASSAGE 



Having done with these standard basic preliminaries, let us now turn to the first-passage properties of a random walk evolving in discrete time via the Markov rule (1) with arbitrary symmetric jump length distribution $\phi(\xi)$. We first define the restricted Green's function $G^{+}(x, x_0, n)$ as the probability (density) that the walker, starting at $x_0 \ge 0$ at step 0, reaches the position $x \ge 0$ at step $n$ without crossing the origin in between, i.e., it stays positive at all intermediate steps and lands at $x$ exactly at the $n$-th step

$$ G^{+}(x, x_0, n) = \mathrm{Prob}\left[x_n = x,\, x_{n-1} > 0,\, x_{n-2} > 0, \ldots, x_1 > 0 \,|\, x_0\right]. \qquad (22) $$

Using the Markov property of the evolution, one can again write down the evolution equation for the restricted Green's function, both forward and backward, as in the case of the free Green's function in the previous section

$$ G^{+}(x, x_0, n) = \int_0^{\infty} G^{+}(x', x_0, n-1)\, \phi(x - x')\, dx' \quad \text{(forward)} \qquad (23) $$

$$ G^{+}(x, x_0, n) = \int_0^{\infty} G^{+}(x, x_0', n-1)\, \phi(x_0' - x_0)\, dx_0' \quad \text{(backward)} \qquad (24) $$

The interpretation is as before. For example, in the forward case, one considers the walker reaching $x'$ at step $(n-1)$ (staying positive always) and then making a final jump $x' \to x$ at step $n$ by drawing a random length $x - x'$ from the distribution $\phi(\xi)$. Similarly, in the backward equation, the particle at step 1 jumps from its initial position $x_0$ to a new position $x_0'$ and subsequently evolves for $(n-1)$ steps starting from this new initial position $x_0'$ while staying positive all along. One then integrates over all possible jumps at the first step, but making sure that $x_0'$ is positive.

The survival probability or the persistence is defined as the probability $Q(x_0, n)$ that the particle survives (i.e., stays positive) up to step $n$, no matter what the final position $x$ at step $n$ is. Thus

$$ Q(x_0, n) = \mathrm{Prob}\left[x_n > 0,\, x_{n-1} > 0,\, x_{n-2} > 0, \ldots, x_1 > 0 \,|\, x_0\right] = \int_0^{\infty} G^{+}(x, x_0, n)\, dx. \qquad (25) $$

Thus, one can either first solve the forward equation (23) to obtain the restricted Green's function $G^{+}(x, x_0, n)$ for all $x$ and then integrate over $x$ in (25) to obtain the survival probability $Q(x_0, n)$. Alternatively, and in a much easier way, one can integrate the backward equation (24) over $x$ and write directly a backward evolution equation for the survival probability itself

$$ Q(x_0, n) = \int_0^{\infty} Q(x_0', n-1)\, \phi(x_0' - x_0)\, dx_0'. \qquad (26) $$

Thus one saves the extra integration step (25) and just needs to solve the integral equation (26) starting from the initial condition $Q(x_0, 0) = 1$ for all $x_0 \ge 0$. This initial condition follows from the fact that the walker definitely (with probability 1) does not cross the origin in 0 steps. One thus sees why the backward equation is more advantageous than the forward equation, at least as far as the persistence properties are concerned.

Once we have obtained the survival probability $Q(x_0, n)$, the first-passage probability can be easily computed from it. The first-passage probability $F(x_0, n)$ is defined as the probability that the walker, starting initially at $x_0$, crosses the origin for the first time immediately after step $n-1$ (i.e., it is positive at step $n-1$, but becomes negative at step $n$). It then follows that

$$ F(x_0, n) = Q(x_0, n-1) - Q(x_0, n) \qquad (27) $$

as it counts the fraction of paths that survived up to step $(n-1)$, but not up to step $n$.
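
The backward equation (26) can be iterated numerically on a grid, which gives a direct feel for how $Q(x_0, n)$ and the first-passage probability (27) are built up step by step. The sketch below does this for the exponential jump distribution of example (i). The function name, the grid size `L`, the spacing `h` and the treatment of the region $x_0' > L$ (where $Q$ is taken to be 1) are all assumptions made for illustration.

```python
import math

def survival_exponential(n_max, L=20.0, h=0.05):
    """Iterate Eq. (26) for phi(xi) = exp(-|xi|)/2 on the grid x0 = 0, h, ..., L.

    Returns a list Q with Q[n][i] approximating Q(x0 = i*h, n).
    Assumption: beyond the grid (x0' > L) the survival probability is set to 1.
    """
    N = int(round(L / h)) + 1
    # phi evaluated at grid separations d*h.
    kernel = [0.5 * math.exp(-d * h) for d in range(N)]
    Q = [[1.0] * N]  # initial condition Q(x0, 0) = 1
    for _ in range(n_max):
        prev = Q[-1]
        new = []
        for i in range(N):
            # Trapezoidal rule for the integral over x0' in [0, L].
            s = sum(prev[j] * kernel[abs(j - i)] for j in range(N))
            s -= 0.5 * (prev[0] * kernel[i] + prev[N - 1] * kernel[N - 1 - i])
            # Contribution of x0' > L with Q = 1: int_L^inf phi(x0' - x0) dx0'.
            tail = 0.5 * math.exp(-(L - i * h))
            new.append(s * h + tail)
        Q.append(new)
    return Q

def first_passage(Q, n, i):
    """Eq. (27): F(x0 = i*h, n) = Q(x0, n-1) - Q(x0, n)."""
    return Q[n - 1][i] - Q[n][i]
```

For this distribution the first step can be done by hand, $Q(x_0, 1) = 1 - \frac{1}{2} e^{-x_0}$, and one more integration gives $Q(0, 2) = 3/8$, which the grid iteration should reproduce to within the discretization error.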
So, to compute the first-passage or the survival probability, we need to solve the integral equations (23), (24) or just directly (26). Note the important differences in these equations compared to those for the free Green's function in (8) and (9): they look almost similar, but not quite. In equations (23), (24) or (26), the limits of integration on the rhs are from 0 to $\infty$, as opposed to $-\infty$ to $\infty$ in the free Green's function equations (8) and (9). This makes a huge difference! The reason is that, even though (26) apparently seems to have a convolution form, the integration is only over the half-space $[0, \infty]$ and not the full space $[-\infty, \infty]$. If the limits were over the full space, as in the case of the free Green's function, one could simply use Fourier transform methods. But for the half-space problem, unfortunately, one cannot use the simple Fourier transform technique. In fact, such half-space integral equations have been well studied in mathematics and are known as Wiener-Hopf integral equations [31]. For a general kernel $\phi(x - x')$, they are notoriously difficult to solve! However, for the particular case where the kernel $\phi(x - x')$ has the interpretation of a probability density function (i.e., a non-negative and normalizable function), one can obtain an explicit solution [23] (as discussed later).

The discussion above makes clear the technical reason why computing the first-passage properties of even a simple random walker (but with arbitrary jump distribution $\phi(\xi)$) is nontrivial. Before we write the solution explicitly, let us first see how this problem simplifies in the continuous-time Brownian limit.

Brownian limit: In the Brownian limit, one can reduce the discrete-time backward integral equation (26) for the survival probability to a partial differential equation. Let us consider the survival probability $Q(x_0, t + \Delta t)$ up to time $t + \Delta t$. Let us break the interval $[0, t + \Delta t]$ into two intervals $[0, \Delta t]$ and $[\Delta t, t + \Delta t]$. In the first small interval the particle evolves from its initial position $x_0$ to a new random position $x_0 + \xi(0)\,\Delta t$, where $\xi(0)$ is the initial noise in the Langevin equation (7). Subsequently the particle evolves in the interval $[\Delta t, t + \Delta t]$ starting from its new initial position $x_0 + \xi(0)\,\Delta t$. Thus, the analogue of (26) is

$$ Q(x_0, t + \Delta t) = \int_{-\infty}^{\infty} Q(x_0 + \xi(0)\,\Delta t,\, t)\, \phi(\xi(0))\, d\xi(0). \qquad (28) $$

Expanding in a Taylor series as in the case of the free Green's function and using the properties of the white noise, one then gets the backward Fokker-Planck equation for the survival probability

$$ \frac{\partial Q}{\partial t} = D\, \frac{\partial^2 Q}{\partial x_0^2} \qquad (29) $$

valid for all $x_0 \ge 0$ and to be solved with the boundary conditions: (a) $Q(x_0 = 0, t) = 0$ for all $t$ and (b) $Q(x_0 \to \infty, t) = 1$ for all $t$, and subject to the initial condition $Q(x_0, 0) = 1$ for all $x_0 > 0$. Thus in the Brownian limit, we are able to reduce the Wiener-Hopf integral equation to a partial differential equation (PDE): that's already a big simplification!

The solution to this PDE can be obtained by various standard methods. Let me just mention here a slightly nonstandard but quick method. Clearly, using the diffusive scaling $x \sim t^{1/2}$, it follows that the function $Q(x_0, t)$ must have a scaling form: $Q(x_0, t) = U\left(x_0/\sqrt{4Dt}\right)$. Substituting this scaling form in the PDE (29) one obtains an ordinary differential equation (that's what scaling always does: it reduces a function of two variables to a function of a single scaled variable) valid for $z \ge 0$

$$ \frac{d^2 U}{dz^2} + 2z\, \frac{dU}{dz} = 0. \qquad (30) $$

The initial and the boundary conditions of the PDE translate into two boundary conditions: $U(0) = 0$ and $U(z \to \infty) = 1$. The solution can be easily obtained: $U(z) = \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-u^2}\, du$. Thus we get the explicit well known solution [1, 21]

$$ Q(x_0, t) = \mathrm{erf}\left(\frac{x_0}{\sqrt{4Dt}}\right). \qquad (31) $$

Note that even though we had assumed scaling (without really proving it!), the solution (31) is exact for all $t$, as one can directly verify by substituting it in the PDE (29). One also sees that for large $t$ and fixed $x_0$ the survival probability decays as a power law

$$ Q(x_0, t) \approx \frac{x_0}{\sqrt{\pi D t}}. \qquad (32) $$

The first-passage probability is given by $F(x_0, t) = -\partial Q/\partial t$, which is just the continuous-time limit of (27). Using (31) one then gets

$$ F(x_0, t) = \frac{x_0}{\sqrt{4\pi D t^3}}\, e^{-x_0^2/4Dt} \qquad (33) $$

which decays, for large $t$ and fixed $x_0$, as $t^{-3/2}$ with the famous first-passage exponent 3/2 [1].
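
The statements about the Brownian survival probability can be checked numerically with a few lines of Python. This is a minimal sketch with illustrative function names and finite-difference step sizes chosen as assumptions: it evaluates the error-function solution, measures how well it satisfies the backward diffusion equation by central differences, and compares its time derivative with the closed-form first-passage density.

```python
import math

def survival_brownian(x0, t, D=1.0):
    """Q(x0, t) = erf(x0 / sqrt(4 D t)), the solution quoted in Eq. (31)."""
    return math.erf(x0 / math.sqrt(4.0 * D * t))

def pde_residual(x0, t, D=1.0, h=1e-3):
    """dQ/dt - D d^2Q/dx0^2 by central differences; close to 0 if Eq. (29) holds."""
    dq_dt = (survival_brownian(x0, t + h, D) - survival_brownian(x0, t - h, D)) / (2.0 * h)
    d2q_dx2 = (survival_brownian(x0 + h, t, D) - 2.0 * survival_brownian(x0, t, D)
               + survival_brownian(x0 - h, t, D)) / (h * h)
    return dq_dt - D * d2q_dx2

def first_passage_brownian(x0, t, D=1.0):
    """F(x0, t) = x0 / sqrt(4 pi D t^3) * exp(-x0^2 / (4 D t)), i.e. -dQ/dt."""
    return x0 / math.sqrt(4.0 * math.pi * D * t ** 3) * math.exp(-x0 * x0 / (4.0 * D * t))
```

For example, `pde_residual(1.0, 2.0)` should be zero up to finite-difference error, and for $x_0 \ll \sqrt{Dt}$ the survival probability approaches $x_0/\sqrt{\pi D t}$.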



A. Pollaczek-Spitzer formula and Sparre Andersen Theorem 



Let us now go back to the basic Wiener-Hopf integral equation (26) that describes the evolution of the survival probability $Q(x_0, n)$. As mentioned before, the solution is nontrivial for a general kernel $\phi(x - x')$. However, when the cumulative distribution $\Phi(x) = \int_{-\infty}^{x} \phi(\xi)\, d\xi$ is a continuous function, as in examples (i)-(iv) in Section II (but not for the lattice random walk (v), where $\Phi(x)$ is a discontinuous function), an explicit solution was first found by Pollaczek [22] and later independently by Spitzer [23] in a slightly different context. Pollaczek was interested in finding the distribution of the ordered partial sums of a set of i.i.d. variables, whereas Spitzer was interested in finding the distribution of the maximum of the set of partial sums, which is related (see later) to the survival probability. Spitzer's derivation was more combinatorial. The same integral equation also appeared previously in a variety of half-space transport problems in physics and astrophysics (see [32] and references therein), and several other derivations of the solution of this equation, mostly algebraic in nature, are known [32]. Unfortunately, all these derivations, both the combinatorial and the algebraic ones, are highly technical in nature and there is no easy way! Here I will avoid these technical steps and instead just state the final result and discuss its applications. Readers who are interested in the algebraic derivation may consult [33], where we have listed systematically the steps that lead to the final solution.

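The solution of (26), with the initial condition $Q(x_0, 0) = 1$ for all $x_0 \ge 0$, is given in terms of a double Laplace transform of $Q(x_0, n)$

$$ \int_0^{\infty} \left[\sum_{n=0}^{\infty} Q(x_0, n)\, s^n\right] e^{-p x_0}\, dx_0 = \frac{1}{p\sqrt{1-s}}\, \exp\left[-\frac{p}{\pi} \int_0^{\infty} \frac{\ln\left(1 - s\,\hat{\phi}(k)\right)}{p^2 + k^2}\, dk\right] \qquad (34) $$

where $\hat{\phi}(k) = \int_{-\infty}^{\infty} \phi(\xi)\, e^{ik\xi}\, d\xi$ is the Fourier transform of the jump length distribution. We will refer to this solution in (34) as the Pollaczek-Spitzer formula.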
Let us now discuss some consequences of this explicit result. 

B. Sparre Andersen Theorem 

Although the survival probability $Q(x_0, n)$ for arbitrary $x_0$ depends explicitly on the jump length distribution $\phi(\xi)$, as evident in (34), it turns out that $Q(0, n)$ (the survival probability of the particle up to $n$ steps starting at the origin) becomes, somewhat miraculously, independent of the distribution $\phi(\xi)$ as long as it is continuous. To see this, let us take the $p \to \infty$ limit in (34). Making a change of variable $p x_0 = y$ on the lhs of (34) and taking the $p \to \infty$ limit, the lhs reduces, to leading order, to $\frac{1}{p} \sum_{n=0}^{\infty} Q(0, n)\, s^n$. On the rhs, taking the $p \to \infty$ limit gives $\frac{1}{p\sqrt{1-s}}$. Equating the leading order terms (of $O(1/p)$ for large $p$) on both sides gives the identity, for all $s$,

$$ \sum_{n=0}^{\infty} Q(0, n)\, s^n = \frac{1}{\sqrt{1-s}}. \qquad (35) $$

Equating powers of $s$ one gets the Sparre Andersen theorem [20]

$$ q(n) = Q(0, n) = \binom{2n}{n}\, 2^{-2n} \qquad (36) $$

where we have used, for convenience, a shorthand notation $q(n)$ for $Q(0, n)$. Thus, quite amazingly, the survival probability $q(n) = Q(0, n)$ (starting from the origin) is completely universal, and that too for all $n$ (and not just for large $n$). No matter whether the jump length distribution is exponential, Gaussian or uniform, $q(n)$ is the same and is given by the simple formula in (36). Sparre Andersen derived this formula originally using a rather involved combinatorial approach. This simple looking formula is however a bit deceptive and led several authors to try to derive it in a 'simple' way! Unfortunately, all attempts led to equally complicated derivations (see [33] and references therein). Deriving this formula as a special case of the Pollaczek-Spitzer solution is instructive, as it shows that the role of the starting point $x_0 = 0$ is important for this universality. One loses this universality the moment $x_0$ is nonzero.
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Let us also note another interesting fact. In the hmit of large n, the survival probability q{n) in ([36]) decays, to 
leading order, as 



qin) = Q{0,n) 



1 



(37) 



Let us emphasise again that this result holds for an arbitrary symmetric and continuous jump distribution $\phi(\xi)$, including even the Lévy flights! One may 'naively' remark that this $n^{-1/2}$ asymptotic decay is equivalent to the $t^{-1/2}$ decay of the survival probability in the Brownian limit derived in (31). However, this is not correct and is actually rather subtle, as was shown in [33]. Consider first a continuous and symmetric jump distribution with a finite second moment $\sigma^2 = \int_{-\infty}^{\infty} \xi^2\, \phi(\xi)\, d\xi$. To derive the Brownian limit from the Pollaczek-Spitzer formula (34), one first considers the scaling limit $x_0 \to \infty$ and $n \to \infty$ while keeping the ratio $x_0/\sqrt{n}$ fixed. A careful asymptotic analysis of (34) shows that in this limit the first two leading terms for large $n$ are given by [33]

$$Q(x_0,n) \simeq \mathrm{erf}\left(\frac{x_0}{\sigma\sqrt{2n}}\right) + \frac{1}{\sqrt{\pi n}}\, e^{-x_0^2/2\sigma^2 n}. \tag{38}$$

If one now takes the $x_0 \ll \sqrt{n}$ limit, one recovers the universal Sparre Andersen result in (37) from the second term on the rhs of Eq. (38). On the other hand, if one keeps the scaling ratio $x_0/\sqrt{n}$ fixed and takes the strict $n \to \infty$ limit, the second term in (38) becomes subleading and the first term on the rhs (which remains nonuniversal in this limit as it contains $\sigma$ explicitly) becomes the leading term that provides the Brownian result in (31) upon identifying $\sigma^2 n = 2Dt$. Thus the $n^{-1/2}$ universal decay of the survival probability (for $x_0 = 0$) is not quite related to the Brownian result $t^{-1/2}$: they originate from two different terms in (38).
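The two statements above can be probed numerically. The following is a minimal sketch, not part of the original derivation: it assumes that (36) is the Sparre Andersen formula $q(n) = \binom{2n}{n} 2^{-2n}$, estimates $q(n)$ by direct simulation for three illustrative jump laws, and evaluates the two terms of (38) separately. All function names and the choice of jump laws are hypothetical.

```python
import math
import random

def sparre_andersen(n):
    """Universal q(n) = C(2n, n) / 2^(2n) for a walk starting at x_0 = 0."""
    return math.comb(2 * n, n) / 4 ** n

def survival_estimate(draw_jump, n_steps, n_walks, rng):
    """Fraction of simulated walks from x_0 = 0 with x_1, ..., x_n all >= 0."""
    survived = 0
    for _ in range(n_walks):
        x = 0.0
        for _ in range(n_steps):
            x += draw_jump(rng)
            if x < 0.0:
                break
        else:
            survived += 1  # the walk never crossed below zero
    return survived / n_walks

# Three symmetric and continuous jump laws, chosen only for illustration.
JUMPS = {
    "gaussian": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.uniform(-1.0, 1.0),
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
}

def two_terms(x0, n, sigma):
    """The two leading large-n terms of Q(x0, n) in Eq. (38), returned separately."""
    brownian = math.erf(x0 / (sigma * math.sqrt(2.0 * n)))
    universal = math.exp(-x0 ** 2 / (2.0 * sigma ** 2 * n)) / math.sqrt(math.pi * n)
    return brownian, universal

if __name__ == "__main__":
    rng = random.Random(1)
    print("exact q(10):", sparre_andersen(10))
    for name, draw in JUMPS.items():
        print(name, survival_estimate(draw, 10, 20000, rng))
    # At fixed x0 / sqrt(n) the erf term stays constant while the
    # second term decays as n^(-1/2).
    print(two_terms(10.0, 100, 1.0), two_terms(100.0, 10000, 1.0))
```

With $x_0 = 0$ the first term of `two_terms` vanishes and only the universal $1/\sqrt{\pi n}$ piece survives, whereas at fixed $x_0/\sqrt{n}$ the error-function term dominates, in line with the discussion above.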

Generalization to asymmetric jump distribution: Actually there exists a generalized Sparre Andersen theorem [20] which holds for a non-symmetric (but still continuous) jump length distribution $\phi(\xi)$. Unlike in the symmetric case, for an asymmetric jump distribution of a random walk starting at $x_0 = 0$, the probability that the walker stays on the positive side up to $n$ steps is different from the probability that it stays on the negative side up to $n$ steps. Thus one needs to define two different survival probabilities

$$q_+(n) = \mathrm{Prob}[x_n \ge 0,\, x_{n-1} \ge 0,\, \ldots,\, x_1 \ge 0 \,|\, x_0 = 0] \tag{39}$$

$$q_-(n) = \mathrm{Prob}[x_n \le 0,\, x_{n-1} \le 0,\, \ldots,\, x_1 \le 0 \,|\, x_0 = 0]. \tag{40}$$

For a symmetric jump distribution $q_+(n) = q_-(n) = q(n)$. In the asymmetric case, the generalized Sparre Andersen theorem reads

$$\tilde q_+(s) = \sum_{n=0}^{\infty} q_+(n)\, s^n = \exp\left[\sum_{n=1}^{\infty} \frac{p_n^+}{n}\, s^n\right] \tag{41}$$

$$\tilde q_-(s) = \sum_{n=0}^{\infty} q_-(n)\, s^n = \exp\left[\sum_{n=1}^{\infty} \frac{p_n^-}{n}\, s^n\right] \tag{42}$$

where $p_n^+ = \mathrm{Prob}(x_n \ge 0) = \int_0^{\infty} G(x,x_0,n)\, dx$ and $p_n^- = \mathrm{Prob}(x_n \le 0) = \int_{-\infty}^{0} G(x,x_0,n)\, dx$ are just the probabilities that exactly at the $n$-th step the particle position is positive and negative respectively. For the symmetric (zero bias) case, $p_n^+ = p_n^- = 1/2$ (by symmetry) and then both equations (41) and (42) reduce to (35).
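In practice, the exponential form of (41) and (42) can be inverted term by term. Differentiating $\tilde q(s) = \exp[\sum_n p_n s^n/n]$ with respect to $s$ and comparing powers of $s$ gives the recurrence $n\, q(n) = \sum_{k=1}^{n} p_k\, q(n-k)$ with $q(0) = 1$. The sketch below implements this recurrence; it is an illustration under that derivation, and the function name is hypothetical. For $p_n = 1/2$ it should reproduce the symmetric result $q(n) = \binom{2n}{n} 2^{-2n}$.

```python
from fractions import Fraction
from math import comb

def survival_from_p(p):
    """Return [q(0), ..., q(N)] from p = [unused, p_1, ..., p_N].

    Uses n q(n) = sum_{k=1}^{n} p_k q(n-k), which follows from
    sum_n q(n) s^n = exp(sum_{n>=1} p_n s^n / n).
    """
    n_max = len(p) - 1
    q = [Fraction(1)]
    for n in range(1, n_max + 1):
        q.append(sum(p[k] * q[n - k] for k in range(1, n + 1)) / n)
    return q

if __name__ == "__main__":
    # Symmetric case: p_n = 1/2 for every n.
    p_sym = [None] + [Fraction(1, 2)] * 6
    for n, qn in enumerate(survival_from_p(p_sym)):
        print(n, qn, Fraction(comb(2 * n, n), 4 ** n))
```

The same routine accepts any sequence $p_n^{\pm}$, so it can be reused for an asymmetric jump distribution once the $p_n^{\pm}$ are known.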

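Let us mention here a special case with drift, noted by Le Doussal and Wiese [70], that is explicitly solvable and that gives rise to a power-law decay of the survival probability with a continuously varying exponent. Consider the evolution

$$x_n = x_{n-1} + \mu + \xi_n \tag{43}$$

with $x_0 = 0$. Here $\mu$ represents a drift and the $\xi_n$'s are i.i.d. noise variables, each drawn from a symmetric Cauchy distribution

$$\phi(\xi) = \frac{a}{\pi(\xi^2 + a^2)}. \tag{44}$$

In this case, the variable $y_n = x_n - \mu n$ undergoes a symmetric random walk, $y_n = y_{n-1} + \xi_n$. Hence, the probability distribution of $y_n$ at step $n$, starting from $y_0 = 0$, can be easily computed from the free Green's function discussed in Section I. In fact, the Cauchy distribution corresponds to the Lévy laws in (13) with Lévy index $1$ (this index should not be confused with the drift $\mu$ in (43)). Hence,

$$G(y,0,n) = \frac{1}{an}\, \mathcal{L}_1\left(\frac{y}{an}\right) = \frac{an}{\pi(y^2 + a^2 n^2)}, \tag{45}$$

where $\mathcal{L}_1$ denotes the Lévy scaling function of (13) at index $1$. Thus

$$p_n^+ = \mathrm{Prob}(x_n \ge 0) = \mathrm{Prob}(y_n \ge -\mu n) = \int_{-\mu n}^{\infty} \frac{an}{\pi(y^2 + a^2 n^2)}\, dy = \frac{1}{2} + \frac{1}{\pi} \tan^{-1}(\mu/a) \tag{46}$$

$$p_n^- = \mathrm{Prob}(x_n \le 0) = \mathrm{Prob}(y_n \le -\mu n) = \int_{-\infty}^{-\mu n} \frac{an}{\pi(y^2 + a^2 n^2)}\, dy = \frac{1}{2} - \frac{1}{\pi} \tan^{-1}(\mu/a). \tag{47}$$

Substituting these results in (41) and (42) one gets

$$\tilde q_\pm(s) = \sum_{n=0}^{\infty} q_\pm(n)\, s^n = \frac{1}{(1-s)^{c_\pm}}; \qquad c_\pm = \frac{1}{2} \pm \frac{1}{\pi} \tan^{-1}(\mu/a). \tag{48}$$

Inverting the generating function one then finds that for large $n$

$$q_\pm(n) \sim n^{-\theta_\pm}; \qquad \theta_\pm(\mu) = 1 - c_\pm = \frac{1}{2} \mp \frac{1}{\pi} \tan^{-1}(\mu/a). \tag{49}$$

Thus the persistence exponents $\theta_\pm(\mu)$ are nontrivial and vary continuously with the drift $\mu$. For example, as $\mu \to \infty$ (drift away from the origin), $\theta_+ \to 0$ (the particle always remains positive) and as $\mu \to -\infty$ (drift towards the origin), $\theta_+ \to 1$, leading to a faster decay than the driftless ($\mu = 0$) case where $\theta_\pm(0) = 1/2$.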
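The inversion of (48) can be made explicit: the coefficient of $s^n$ in $(1-s)^{-c}$ is $\Gamma(n+c)/[\Gamma(c)\, n!]$, which behaves as $n^{c-1}/\Gamma(c)$ for large $n$. The following minimal sketch (function names are illustrative, not from the original text) evaluates these coefficients and estimates the persistence exponent from a log-slope between $n$ and $2n$.

```python
import math

def c_pm(mu, a):
    """The constants c_+ and c_- of Eq. (48) for drift mu and Cauchy scale a."""
    t = math.atan(mu / a) / math.pi
    return 0.5 + t, 0.5 - t

def q_coeff(c, n):
    """Coefficient of s^n in (1 - s)^(-c): Gamma(n + c) / (Gamma(c) n!)."""
    return math.exp(math.lgamma(n + c) - math.lgamma(c) - math.lgamma(n + 1))

def local_exponent(c, n):
    """Estimate theta in q(n) ~ n^(-theta) from the values at n and 2n."""
    return -math.log(q_coeff(c, 2 * n) / q_coeff(c, n)) / math.log(2.0)

if __name__ == "__main__":
    for mu in (-5.0, -1.0, 0.0, 1.0, 5.0):
        c_plus, c_minus = c_pm(mu, 1.0)
        print(mu, local_exponent(c_plus, 1000), 1.0 - c_plus)
```

For $\mu = 0$ the coefficients reduce to the symmetric Sparre Andersen values, and the estimated exponent approaches $1/2$.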

IV. FIRST APPLICATION: STATISTICS OF THE MAXIMUM OF THE WALK 

The study of the statistics of the maximum of a set of i.i.d. random variables goes back a long way and the subject is called Extreme Value Statistics (EVS) [35]. The results are well established and have found many applications in a wide variety of fields [35]. However, the standard EVS, developed for i.i.d. variables, does not apply when the random variables are correlated. Recently there has been growing interest in the statistics of the maximum of a set of correlated random variables [36]. The random walk model discussed in this article presents a solvable example of the statistics of the maximum of a set of strongly correlated variables.

More precisely, let us consider again the sequence (1) starting from $x_0 = 0$, where the successive noise variables $\xi_n$'s are, as usual, i.i.d. variables each drawn from a symmetric and continuous $\phi(\xi)$. Let us define the global maximum of the walk up to $n$ steps

$$M_n = \max(0, x_1, x_2, \ldots, x_n). \tag{50}$$

Clearly $M_n$ is a random variable taking different values for different realizations of the walk and we would like to compute the distribution of $M_n$. Note that even though the noise variables $\xi_n$'s are uncorrelated, the positions $x_n$'s of the walker are correlated. For example, when $\sigma^2 = \langle \xi^2 \rangle$ is finite, it is easy to see from (1) that

$$\langle x_m x_n \rangle = \sigma^2 \min(m, n). \tag{51}$$

Thus, this is clearly an example where one is trying to compute the distribution of the maximum of a set of correlated random variables.
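The correlation in (51) is easy to check by simulation. The sketch below is only an illustration: it assumes uniform jumps on $[-1, 1]$, for which $\sigma^2 = 1/3$, and the function name is hypothetical.

```python
import random

def position_covariance(m, n, n_walks=20000, seed=7):
    """Monte Carlo estimate of <x_m x_n> for uniform jumps on [-1, 1]."""
    rng = random.Random(seed)
    lo, hi = min(m, n), max(m, n)
    total = 0.0
    for _ in range(n_walks):
        x = 0.0
        x_lo = 0.0
        for step in range(1, hi + 1):
            x += rng.uniform(-1.0, 1.0)
            if step == lo:
                x_lo = x  # remember the position at the earlier time
        total += x_lo * x
    return total / n_walks

if __name__ == "__main__":
    # Expected value: sigma^2 * min(3, 12) = (1/3) * 3 = 1.
    print(position_covariance(3, 12))
```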

The distribution of $M_n$, as we will see now, is actually closely related to the survival probability $Q(x_0, n)$ discussed in the previous section. To establish this connection, let us first define the cumulative distribution $\mathrm{Prob}(M_n \le y)$. This is just the probability that the walk, starting at $x_0 = 0$ at step 0, stays below the level $x = y$ up to step $n$, i.e.,

$$\mathrm{Prob}(M_n \le y) = \mathrm{Prob}[x_1 \le y,\, x_2 \le y,\, \ldots,\, x_n \le y]. \tag{52}$$

Let us make a shift and define $z_k = y - x_k$. Then, by the symmetry of $\phi(\xi)$, the $z_k$'s evolve via the same Markov rule (1), but starting from the initial position $z_0 = y$ (since $x_0 = 0$). Thus (52) reduces to

$$\mathrm{Prob}(M_n \le y) = \mathrm{Prob}[z_1 \ge 0,\, z_2 \ge 0,\, \ldots,\, z_n \ge 0 \,|\, z_0 = y] = Q(y, n) \tag{53}$$

where $Q(y, n)$ is precisely the survival probability of the walk up to $n$ steps, starting at $y$. The solution of $Q(y, n)$ is given by the Pollaczek-Spitzer formula (34) for arbitrary symmetric and continuous distribution $\phi(\xi)$.



A. Expected Maximum

The exact solution for $Q(y, n)$ in (34) thus also provides an exact solution (or rather the double Laplace transform) of the probability distribution of the maximum, at least in principle. In practice, however, extracting the moments of the maximum from this explicit Pollaczek-Spitzer formula (34) turns out to be rather nontrivial. For instance, even the first moment, i.e., the expected maximum $E[M_n]$, is hard to extract for all $n$ and arbitrary continuous noise distribution $\phi(\xi)$. This question first arose in the context of a packing problem in two dimensions where $n$ rectangles of variable sizes are packed in a semi-infinite strip of width one [37, 38]. It was shown in Ref. [38] that for the special case of the uniform jump distribution, $\phi(\xi) = 1/2$ for $-1 \le \xi \le 1$ and $\phi(\xi) = 0$ outside, for large $n$,

$$E[M_n] = \sqrt{\frac{2n}{3\pi}} - 0.297952\ldots + O(n^{-1/2}). \tag{54}$$

The leading $\sqrt{n}$ behavior is easy to understand and can be derived from the corresponding behavior of a continuous-time Brownian motion after a suitable rescaling [38]. However, the leading finite-size correction term turns out to be a nontrivial constant $-c$ with $c = 0.29795219028\ldots$ that was computed in Ref. [38] by enumerating an intricate double series obtained after a lengthy calculation by a different method. It is important to compute the leading finite-size correction term very precisely, as it provides a sharper estimate of the efficiency of rectangle packing algorithms studied in computer science [37].

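Recently, we were able to show [14], starting from the Pollaczek-Spitzer formula, that for an arbitrary continuous and symmetric jump distribution with a finite second moment $\sigma^2 = \int_{-\infty}^{\infty} \xi^2\, \phi(\xi)\, d\xi$, the expected maximum has a similar asymptotic behavior as in the uniform case, namely,

$$E[M_n] = \sigma \sqrt{\frac{2n}{\pi}} - c + O(n^{-1/2}). \tag{55}$$

Moreover, an exact expression for the constant $c$ was found [14]:

$$c = -\frac{1}{\pi} \int_0^{\infty} \frac{dk}{k^2}\, \ln\left[\frac{1 - \hat\phi(k)}{\sigma^2 k^2/2}\right] \tag{56}$$

where $\hat\phi(k)$ is the Fourier transform of $\phi(\xi)$. In particular, for the uniform distribution (example (iii)), one has $\hat\phi(k) = \sin(k)/k$ and $\sigma^2 = 1/3$, so that (56) gives

$$c = -\frac{1}{\pi} \int_0^{\infty} \frac{dk}{k^2}\, \ln\left[\frac{6}{k^2}\left(1 - \frac{\sin k}{k}\right)\right] = 0.29795219028\ldots \tag{57}$$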
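The integral in (57) converges slowly at large $k$ but is straightforward to evaluate numerically. The following is a minimal sketch, not the method of the cited references: it uses a composite Simpson rule on $[0, K]$ and, as an assumption of this sketch, replaces the logarithm by $\ln(6/k^2)$ beyond the cutoff $K$, where the neglected oscillatory piece is of order $K^{-3}$. Function names and default parameters are illustrative.

```python
import math

def integrand(k):
    """ln[6 (1 - sin(k)/k) / k^2] / k^2, with a series expansion near k = 0."""
    if k < 1e-2:
        if k == 0.0:
            return -1.0 / 20.0
        k2 = k * k
        # 6 (1 - sin k / k) / k^2 = 1 - k^2/20 + k^4/840 - ...
        return math.log1p(-k2 / 20.0 + k2 * k2 / 840.0) / k2
    return math.log(6.0 * (1.0 - math.sin(k) / k) / (k * k)) / (k * k)

def simpson(f, a, b, n_intervals):
    """Composite Simpson rule; n_intervals must be even."""
    h = (b - a) / n_intervals
    total = f(a) + f(b)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def constant_c_uniform(k_max=200.0, n_intervals=40000):
    """Numerical estimate of the constant c in Eq. (57)."""
    body = simpson(integrand, 0.0, k_max, n_intervals)
    # Tail: integral of ln(6/k^2)/k^2 from k_max to infinity.
    tail = (math.log(6.0) - 2.0 * math.log(k_max) - 2.0) / k_max
    return -(body + tail) / math.pi

if __name__ == "__main__":
    print(constant_c_uniform())
```

The same quadrature applies to (56) for any other jump distribution with finite variance, once $\hat\phi(k)$ and the corresponding large-$k$ tail are supplied.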



The explicit extraction of the constant correction term (56) from (34) turned out to be highly nontrivial and required a certain number of delicate mathematical manipulations [14]. Interestingly, the same constant $c$ also appears in an apparently different problem, when one tries to compute the average flux to a spherical trap in 3 dimensions of particles undergoing Rayleigh flights [39]. The origin of this connection has now been understood: both problems are effectively described by exactly the same Wiener-Hopf integral equation, albeit with two different initial conditions [33]. Many other interesting nontrivial exact results for this spherical trap problem have been recently computed in [33, 40, 41].

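For jump distributions where $\sigma^2$ is infinite, as in the case of Lévy flights, a similar formula for the expected maximum can be derived [14] from the Pollaczek-Spitzer formula. For example, for $\hat\phi(k) = 1 - |ak|^{\mu} + O(k^2)$ (for small $k$ and with $1 < \mu \le 2$), the expected maximum is given by [14]

$$\frac{E(M_n)}{a} = \frac{\mu}{\pi}\, \Gamma\left(1 - \frac{1}{\mu}\right) n^{1/\mu} + \gamma + O(n^{1/\mu - 1}) \tag{58}$$

where the constant

$$\gamma = \frac{1}{\pi} \int_0^{\infty} \frac{dk}{k^2}\, \ln\left[\frac{1 - \hat\phi(k/a)}{k^{\mu}}\right]. \tag{59}$$

For example, for $\hat\phi(k) = \exp[-|ak|^{\mu}]$ with $1 < \mu \le 2$, one obtains [3]

$$\gamma = \frac{1}{\pi} \int_0^{\infty} \frac{dk}{k^2}\, \ln\left[\frac{1 - e^{-k^{\mu}}}{k^{\mu}}\right] = \frac{\zeta(1/\mu)}{(2\pi)^{1/\mu} \sin(\pi/2\mu)}. \tag{60}$$

Note that for $0 < \mu \le 1$, the expected maximum is strictly infinite.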

We close this subsection by pointing out another, completely different problem where the expected maximum of a random walk plays an important role. Recently we showed that the expected perimeter $\langle L_n \rangle$ of the convex hull of a 2-dimensional random walk of $n$ steps is exactly equal (up to a factor $2\pi$) to the expected maximum of the $x$-component of this 2-d random walk: $\langle L_n \rangle = 2\pi \langle M_n \rangle$ where $M_n = \max(0, x_1, x_2, \ldots, x_n)$ [42, 43]. This connection allowed us to obtain a number of exact results for the statistics of the convex hulls of random walks in two dimensions. We do not discuss this problem in detail here, but refer the interested readers to [42, 43] for details.



B. Time at which the Random Walker's Trajectory Achieves its Maximum 

In the previous subsection we discussed the statistics of the maximum $M_n$ of an $n$-step walker. Another interesting question is the following: given an $n$-step walker that started at the origin at step 0, at which step $m$ does the maximum $M_n$ occur? In other words, at which time step is the $n$-step walker farthest (in the positive direction) from the origin? This time step $m$ of the occurrence of the maximum is itself a random variable. It turns out that the probability distribution $P(m|n)$ of this time step (given the total number of steps $n$ and that $x_0 = 0$) is also closely related to the survival probability $Q(0, n)$ discussed above.

Before we discuss this, let us remark that for a continuous-time Brownian motion of total duration $t$ and starting at the origin, the analogous probability density $P(t_m|t)$ of the time $t_m$ at which the Brownian motion is maximally away from the origin in the positive direction was computed by Lévy [44]

$$P(t_m|t) = \frac{1}{\pi \sqrt{t_m (t - t_m)}}; \qquad 0 \le t_m \le t \tag{61}$$

known as the celebrated Lévy's arcsine law. The name 'arcsine' is due to the fact that the cumulative distribution of $t_m$ has the arcsine form: $\mathrm{Prob}(t_m \le zt) = \frac{2}{\pi} \arcsin(\sqrt{z})$ for $0 \le z \le 1$. Thus the maximum is more likely to occur at the beginning $t_m = 0$ or at the end $t_m = t$ of the time window, a fact slightly counterintuitive given that the walk is symmetric around 0. Note that Lévy's arcsine law also appears in the distribution of the occupation time of a Brownian motion. Let $t_+ = \int_0^t \theta(x(\tau))\, d\tau$ be the time spent by a Brownian motion of total duration $t$ on the positive side of the origin. Then the probability density function of $t_+$ has exactly the same form as in (61)

$$P(t_+|t) = \frac{1}{\pi \sqrt{t_+ (t - t_+)}}; \qquad 0 \le t_+ \le t. \tag{62}$$

This result looks rather simple, but again is nontrivial to derive. For a derivation using the Feynman-Kac path integral technique, see [24].

The two random variables $t_m$ and $t_+$ represent two rather different observables even though they share the same probability distribution. The derivations in the two cases are also quite different. In mathematical terms, one would say that $t_m \equiv t_+$, where $\equiv$ means that these two random variables have the same statistical law. For the Brownian motion, one can prove this equivalence in law directly, without actually deriving the distribution separately in each case. In fact, this equivalence between $t_m$ and $t_+$ holds for many other Markov processes as well.

Coming back to the random variable $t_m$ of our interest, we note that the distribution of $t_m$ has rather different shapes if one puts various constraints on the Brownian motion. For example, in the case of a Brownian bridge, i.e., a Brownian motion conditioned to be at $x(0) = 0$ and $x(t) = 0$, the probability density of $t_m$ is known to be uniform

$$P(t_m|t) = \frac{1}{t}; \qquad 0 \le t_m \le t. \tag{63}$$

Recently, using path integral methods, this distribution $P(t_m|t)$ was computed for a variety of other constrained Brownian motions, such as the Brownian excursion, the Brownian meander, the reflected Brownian bridge etc. [45]-[48]. Interestingly, $P(t_m = x|t = L)$ is also precisely the disorder-averaged equilibrium probability density of a particle, moving in an external disordered potential in one dimension, at position $x$ in a box of size $L$ [49]. Some of these results have been recently rederived by a functional renormalization group method [50]. In addition, in the context of the convex hull of Brownian motion in 2 dimensions, it turns out that to compute the mean area of the convex hull of a 2-d Brownian motion, one needs to compute the distribution $P(t_m|t)$ of the corresponding one-dimensional Brownian motion [42, 43]. Very recently, the distribution $P(t_m|t)$ has been computed exactly [51] for the random acceleration process (the continuous-time version of the non-Markov evolution rule in (2)). This, to my knowledge, is perhaps the first exact result on $P(t_m|t)$ for a non-Markov process.

The analogous distribution $P(m|n)$ for the discrete-time random walk process in (1) for arbitrary continuous and symmetric jump length distribution $\phi(\xi)$ can be computed exactly from the knowledge of the survival probability $q(n) = Q(0, n)$. To see this, consider Fig. 2. Let us just invert this figure and look at the trajectory from the position $M_n$, i.e., make a change of variable: $z_k = M_n - x_k$. Next we decompose the trajectory into two parts: the left side for time steps between 0 and $m$, and the right side for time steps between $m$ and $n$. By the Markov property, these two parts are independent of each other. In the inverted picture, for the left side, let us also invert the 'time', i.e., propagate backwards. One thus has to consider all $z_k$ paths that start at $z = 0$ and stay positive up to $m$ steps (which is equivalent to saying that the $x_k$'s stay below $M_n$). Note that finally we have to integrate over all possible $M_n$, which means that in the inverted picture the final value of $z_m$ is integrated over. Thus the contribution from this left part is just $q(m) = Q(0, m)$. A similar reasoning shows that the contribution from the right part is $q(n-m) = Q(0, n-m)$. Multiplying, one gets, upon using the Sparre Andersen result (36),

$$P(m|n) = q(m)\, q(n-m) = \binom{2m}{m} \binom{2(n-m)}{n-m}\, 2^{-2n}. \tag{64}$$

One can check easily that this distribution is normalized to unity: $\sum_{m=0}^{n} P(m|n) = 1$. Amazingly, thanks to the Sparre Andersen result, the distribution $P(m|n)$ is again universal for all $m$ and $n$, i.e., independent of the jump length distribution as long as it is symmetric and continuous. Thus, it is given by the same formula (64) for Gaussian, uniform or even for Lévy flights!
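Both the normalization and the universality of (64) can be checked directly. The sketch below is an illustration only: it evaluates $P(m|n) = q(m)\, q(n-m)$ with exact rational arithmetic and compares it with a simulation that uses Cauchy jumps, chosen here as an example of a Lévy flight. Function names are hypothetical.

```python
import math
import random
from fractions import Fraction

def q(n):
    """Sparre Andersen survival probability q(n) = C(2n, n) / 2^(2n)."""
    return Fraction(math.comb(2 * n, n), 4 ** n)

def p_max_time(m, n):
    """P(m|n) = q(m) q(n - m), the distribution of the time of the maximum."""
    return q(m) * q(n - m)

def simulate_max_time(n, n_walks, seed=3):
    """Empirical distribution of the step at which max(0, x_1, ..., x_n) occurs."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(n_walks):
        x, best, arg = 0.0, 0.0, 0
        for step in range(1, n + 1):
            x += math.tan(math.pi * (rng.random() - 0.5))  # Cauchy jump
            if x > best:
                best, arg = x, step
        counts[arg] += 1
    return [c / n_walks for c in counts]

if __name__ == "__main__":
    n = 6
    print("normalization:", sum(p_max_time(m, n) for m in range(n + 1)))
    freq = simulate_max_time(n, 20000)
    for m in range(n + 1):
        print(m, float(p_max_time(m, n)), freq[m])
```

The exact sum equals 1 because $\sum_m q(m)\, q(n-m)$ is the coefficient of $s^n$ in $[\tilde q(s)]^2 = 1/(1-s)$.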

In the limit of large $m$ and $n$ (keeping the ratio $m/n = x$ fixed), one gets

$$P(m|n) \approx \frac{1}{\pi \sqrt{m (n - m)}} \tag{65}$$

which, once again, may naively look like the arcsine law for the Brownian motion (61). However, note that this asymptotic result in (65) is valid even for Lévy flights. This 'arcsine'-looking law, valid for an arbitrary distribution, is not quite the same as the 'arcsine' law in the Brownian limit, for the same reason discussed before in the context of the survival probability.
V. SECOND APPLICATION: STATISTICS OF RECORDS



In this section we will discuss another beautiful recent application of the Sparre Andersen theorem (36) that results in the universal statistics of records in a random walk sequence (including the Lévy flights) [52]. Statistics of records forms an integral part of diverse fields including meteorology [53, 54], hydrology [55], economics [56], sports [57]-[59] and entertainment industries among others. In popular media such as television or newspapers, one always hears and reads about record-breaking events. It is no wonder that the Guinness Book of Records has been a world's best-seller since 1955. Understanding the statistics of records is particularly important in the context of current issues of climatology such as global warming.

Consider any discrete time series $\{x_0, x_1, x_2, \ldots, x_n\}$ of $n$ entries that may represent, e.g., the daily temperatures in a city or the stock prices of a company or the budgets of Hollywood films. A record happens at step $i$ if the $i$-th entry $x_i$ is bigger than all previous entries $x_0, x_1, \ldots, x_{i-1}$. Statistical questions that naturally arise are: (a) How many records occur up to step $n$? (b) How long does a record survive? (c) What is the age of the longest surviving record? Answering these questions is the main goal of the theory of records.

The mathematical theory of records has been studied for over 50 years [60]-[63] and the questions posed in the previous paragraph are well understood in the case when the $x_i$'s are i.i.d. random variables. Recently, there has been a resurgence of interest in record theory due to its multiple applications in diverse complex systems such as spin glasses [64], adaptive processes [65], evolutionary models of biological populations [66, 67] and models of growing networks [69]. The results in the record theory of i.i.d. variables have been rather useful in these different contexts. Recently, Krug has studied the record statistics when the entries have non-identical distributions but still retain their independence [68]. However, in most realistic situations the entries of the time series are correlated. Very little seems to be known about the statistics of records for a correlated time series. Recently, we developed a general formalism [52] to study the statistics of records in a random walk sequence evolving via (1) with an arbitrary jump distribution $\phi(\xi)$. We showed [52] that for symmetric and continuous jump distributions, the statistics of records have universal properties as a consequence of the Sparre Andersen theorem discussed before. Below we discuss this formalism developed in [52] in some detail.

To proceed, let us consider a realization of the sequence of $x_i$'s in (1) up to $n$ steps. The discussion below is general and holds even for an asymmetric jump distribution $\phi(\xi)$. Let $R$ be the number of records in this realization. We use the convention that the first entry $x_0$ is counted as a record. Evidently $R$ is an integer. Let $l_i$ denote the time interval between the $i$-th and the $(i+1)$-th record. Thus, $l_i$ is the age of the $i$-th record, i.e., it denotes the time up to which the $i$-th record survives. We will use the shorthand notation $\vec{l} = \{l_1, l_2, \ldots, l_R\}$ to denote the set of $R$ successive intervals (see Fig. 3). Note that the last record, i.e., the $R$-th record, still stays a record at the $n$-th step since there are no more record-breaking events after it. Hence $l_R$ (the last one in Fig. 3) denotes the number of steps after the occurrence of the last record till the last step $n$. The main idea is to first calculate the joint probability distribution $P(\vec{l}, R|n)$ of the ages $\vec{l}$ and the number $R$ of records, given the length $n$ of the sequence.



To compute this joint distribution we need two quantities as inputs. First, let $q_-(l)$ denote the probability that a walk, starting initially at $x_0$, stays below its starting position $x_0$ up to step $l$. Clearly $q_-(l)$ does not depend on the starting position $x_0$ due to translational invariance and one can just set $x_0 = 0$. Then $q_-(l)$ is precisely the survival probability defined in (40), whose generating function $\tilde q_-(s)$ is given by the generalized Sparre Andersen result in (42). Recall that for the symmetric case $q_-(l) = q_+(l) = q(l)$ is universal and its generating function is given exactly in (35):

$$\tilde q(s) = \sum_{l=0}^{\infty} q(l)\, s^l = \frac{1}{\sqrt{1-s}}. \tag{66}$$

The second input is the first-passage probability $f_-(l)$ that the walker crosses its starting point $x_0$ for the first time between steps $(l-1)$ and $l$ from below $x_0$ (see Fig. 3). Once again, $f_-(l)$ does not depend on the starting point $x_0$ due to translational invariance and one can set $x_0 = 0$. Setting $x_0 = 0$, it follows that $f_-(l) = q_-(l-1) - q_-(l)$, whose generating function can be expressed in terms of that of $q_-(l)$:

$$\tilde f_-(s) = \sum_{l=1}^{\infty} f_-(l)\, s^l = 1 - (1-s)\, \tilde q_-(s). \tag{67}$$

In the symmetric case, $f_+(l) = f_-(l) = f(l)$ with a generating function

$$\tilde f(s) = \sum_{l=1}^{\infty} f(l)\, s^l = 1 - \sqrt{1-s}. \tag{68}$$
FIG. 3: A realization of the random walk sequence $\{x_0 = 0, x_1, x_2, \ldots, x_n\}$ of $n$ steps with $R$ records. Records are shown as big red dots. Note that a local maximum of the walk is not necessarily a record. The set $\{l_1, l_2, \ldots, l_R\}$ denotes the time intervals between successive records.



Armed with these two ingredients $q_-(l)$ and $f_-(l)$, we can then write down explicitly the joint distribution of the ages $\vec{l}$ and the number $R$ of records

$$P(\vec{l}, R|n) = f_-(l_1)\, f_-(l_2) \cdots f_-(l_{R-1})\, q_-(l_R)\, \delta_{\sum_{i=1}^{R} l_i,\, n} \tag{69}$$

where we have used the Markov renewal property of random walks, which dictates that the successive intervals are statistically independent, except for the global sum rule that the total interval length is $n$ (see Fig. 3), which is incorporated by the delta function. Note that since the $R$-th record is the last one (i.e., no more records have happened after it), the interval to its right has distribution $q_-(l)$ rather than $f_-(l)$. One can check that $P(\vec{l}, R|n)$ is normalized to unity when summed over $\vec{l}$ and $R$.

Note that in the case of a symmetric jump distribution, since $q_-(l) = q(l)$ and $f_-(l) = f(l)$ are universal due to the symmetric Sparre Andersen theorem, it follows that $P(\vec{l}, R|n)$ and all its marginals are also universal. Below we will focus on the symmetric case only.



A. Universal Distribution of the Number of Records up to step n 

Let us focus here on the case of a symmetric jump distribution $\phi(\xi) = \phi(-\xi)$ where $\phi(\xi)$ is continuous. In this case we can replace $q_-(l)$ by $q(l)$ and $f_-(l)$ by $f(l)$ in the joint distribution (69). Let us first compute the probability distribution of the number of records $R$, $P(R|n) = \sum_{\vec{l}} P(\vec{l}, R|n)$. To perform this sum, it is easier to consider its generating function. Multiplying (69) by $s^n$ and summing over $\vec{l}$ and $n$, one gets

$$\sum_{n=R-1}^{\infty} P(R|n)\, s^n = [\tilde f(s)]^{R-1}\, \tilde q(s) = \frac{(1 - \sqrt{1-s})^{R-1}}{\sqrt{1-s}} \tag{70}$$

where we have used the explicit expressions for $\tilde q(s)$ and $\tilde f(s)$ from Eqs. (66) and (68).

By expanding in powers of $s$ and computing the coefficient of $s^n$, one gets the explicit result [52]

$$P(R|n) = \binom{2n-R+1}{n}\, 2^{-2n+R-1} \tag{71}$$



which is universal for all $R$ and $n$. The moments of $R$ are also naturally universal and can be computed for all $n$. For example, the first three moments are

$$\begin{aligned}
\langle R \rangle &= (2n+1) \binom{2n}{n}\, 2^{-2n} \\
\langle R^2 \rangle &= 2n + 2 - \langle R \rangle \\
\langle R^3 \rangle &= -6n - 6 + (7 + 4n) \langle R \rangle.
\end{aligned} \tag{72}$$
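The distribution (71) and the moments (72) can be cross-checked with exact rational arithmetic. The sketch below assumes the forms $P(R|n) = \binom{2n-R+1}{n} 2^{-2n+R-1}$ for $1 \le R \le n+1$ and the three moment formulas of (72); function names are illustrative.

```python
from fractions import Fraction
from math import comb

def record_pmf(n):
    """P(R|n) for R = 1, ..., n + 1 as exact fractions, following Eq. (71)."""
    return {R: Fraction(comb(2 * n - R + 1, n), 2 ** (2 * n - R + 1))
            for R in range(1, n + 2)}

def moment(pmf, k):
    """k-th moment of the record number."""
    return sum(R ** k * p for R, p in pmf.items())

def mean_records(n):
    """<R> = (2n + 1) C(2n, n) / 2^(2n), the first line of Eq. (72)."""
    return Fraction((2 * n + 1) * comb(2 * n, n), 4 ** n)

if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        pmf = record_pmf(n)
        print(n, sum(pmf.values()), moment(pmf, 1), mean_records(n))
```

For example, for $n = 2$ the three probabilities are $3/8$, $3/8$ and $1/4$, with mean $15/8$.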

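In particular, for large $n$, the mean, variance and the skewness behave as

$$\begin{aligned}
\text{Mean}: &\quad \langle R \rangle \sim \frac{2}{\sqrt{\pi}} \sqrt{n} \\
\text{Variance}: &\quad \langle R^2 \rangle - \langle R \rangle^2 \sim 2\left(1 - \frac{2}{\pi}\right) n \\
\text{Skewness}: &\quad \frac{\langle (R - \langle R \rangle)^3 \rangle}{\langle (R - \langle R \rangle)^2 \rangle^{3/2}} \sim \frac{4(4 - \pi)}{(2\pi - 4)^{3/2}}.
\end{aligned} \tag{73}$$

In [52], these results were also verified numerically for different jump length distributions (uniform, Gaussian, Cauchy), all giving the same universal answer.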
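A numerical check of this kind is simple to set up. The following sketch is not the simulation of [52]; it is a minimal illustration that counts records along simulated walks (with $x_0$ counted as the first record) for three jump laws and compares the sample mean with $\langle R \rangle = (2n+1) \binom{2n}{n} 2^{-2n}$. Function names and parameters are hypothetical.

```python
import math
import random

JUMPS = {
    "uniform": lambda rng: rng.uniform(-1.0, 1.0),
    "gaussian": lambda rng: rng.gauss(0.0, 1.0),
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
}

def count_records(n, draw_jump, rng):
    """Number of records in one walk of n steps; x_0 = 0 counts as a record."""
    x = 0.0
    current_record = 0.0
    count = 1
    for _ in range(n):
        x += draw_jump(rng)
        if x > current_record:
            current_record = x
            count += 1
    return count

def mean_records_mc(n, n_walks, seed=5):
    """Sample mean of the record number for each jump law in JUMPS."""
    rng = random.Random(seed)
    return {name: sum(count_records(n, draw, rng) for _ in range(n_walks)) / n_walks
            for name, draw in JUMPS.items()}

def mean_records_exact(n):
    """<R> = (2n + 1) C(2n, n) / 2^(2n)."""
    return (2 * n + 1) * math.comb(2 * n, n) / 4 ** n

if __name__ == "__main__":
    print("exact:", mean_records_exact(30))
    print(mean_records_mc(30, 10000))
```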

The results in (73) suggest that there is only a single scale for the number of records, $R \sim n^{1/2}$. This is confirmed by analysing the full distribution $P(R|n)$ of $R$ in (71) in the limit of large $n$. One finds that $P(R|n)$ actually has the following scaling form for large $n$ [52]

$$P(R|n) \sim \frac{1}{\sqrt{n}}\, g\left(\frac{R}{\sqrt{n}}\right); \qquad g(x) = \frac{1}{\sqrt{\pi}}\, e^{-x^2/4}. \tag{74}$$

Thus the distribution is broad in the sense that the mean and the standard deviation, which measures the fluctuation around the mean, both scale as $\sim n^{1/2}$. Also, the mode of this distribution, i.e., the most probable (typical) value of $R$, is at $R = 0$ in the scaling variable $R/\sqrt{n}$. It is interesting to compare this result for the random walk sequence (1) with that of an uncorrelated i.i.d. sequence where each entry $x_i$ is a random variable drawn from some distribution $p(x)$. In the latter case, it is well known [61] that the distribution of the number of records $P(R|n)$ does not depend on $p(x)$, and for large $n$, it approaches a Gaussian,

$$P(R|n) \sim \frac{1}{\sqrt{2\pi \log n}} \exp\left[-\frac{(R - \log n)^2}{2 \log n}\right] \tag{75}$$

with mean $\langle R \rangle = \log n$ and standard deviation $\sqrt{\log n}$. This distribution has its peak at $R = \log n$, in stark contrast to the random walk case where the most probable value of $R$ is zero. In addition, the fluctuations of $R$ are small compared to the mean for large $n$, again in contrast to the random walk case where the fluctuations are large, $\sim O(\sqrt{n})$, for large $n$. Thus the effect of correlation in the random walk sequence manifests itself in a broad scaling distribution for the number of records.
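For comparison, the i.i.d. case can be simulated just as easily. The sketch below relies on a standard fact not stated in the text above: for $n$ i.i.d. continuous entries the $k$-th entry is a record with probability $1/k$, so the exact mean is the harmonic number $H_n = \sum_{k=1}^{n} 1/k \approx \log n + 0.5772$. The two distributions used and all names are illustrative.

```python
import random

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the exact mean record number for i.i.d. entries."""
    return sum(1.0 / k for k in range(1, n + 1))

def iid_mean_records(n, n_samples, draw, seed=11):
    """Sample mean of the number of records in n i.i.d. draws."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        best = draw(rng)
        count = 1  # the first entry is always a record
        for _ in range(n - 1):
            value = draw(rng)
            if value > best:
                best = value
                count += 1
        total += count
    return total / n_samples

if __name__ == "__main__":
    n = 100
    print("H_n:", harmonic(n))
    print("uniform:", iid_mean_records(n, 4000, lambda rng: rng.random()))
    print("exponential:", iid_mean_records(n, 4000, lambda rng: rng.expovariate(1.0)))
```

The slow logarithmic growth seen here contrasts with the $\sqrt{n}$ growth of the mean record number for the random walk.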



B. Universal Age Distribution of Records 

Since the mean number of records grows as $\langle R \rangle \sim n^{1/2}$, it follows that the typical age of a record also grows as $\langle l \rangle \sim n/\langle R \rangle \sim n^{1/2}$ for large $n$. However, there are rare records that are not typical and their ages follow different statistics. For example, what is the age distribution of the longest lasting and the shortest lasting records? These extreme statistics of ages can also be derived from the joint distribution in (69) and hence they are also universal and independent of $\phi(\xi)$.

Let us first consider the longest lasting record with age $l_{\max} = \max(l_1, l_2, \ldots, l_R)$. It is easier to compute its cumulative distribution $Y(l|n) = \mathrm{Prob}[l_{\max} \le l]$ given $n$. Now, if $l_{\max} \le l$, it follows that each of the intervals satisfies $l_i \le l$ for $i = 1, 2, \ldots, R$. Thus, we need to sum up (69) over all $l_i$'s and $R$ such that $l_i \le l$ for each $i$. As usual it is easier to carry out this summation by considering the generating function and we get

$$\sum_{n=0}^{\infty} Y(l|n)\, s^n = \frac{\sum_{l'=0}^{l} q(l')\, s^{l'}}{1 - \sum_{l'=1}^{l} f(l')\, s^{l'}}. \tag{76}$$

One can extract, in principle, the distribution $Y(l|n)$ from this general expression. In particular, the asymptotic large
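$n$ behavior of the average $\langle l_{\max} \rangle = \sum_{l=0}^{\infty} [1 - Y(l|n)]$ can be extracted explicitly [52]

$$\langle l_{\max} \rangle \simeq c_1\, n; \qquad c_1 = 2 \int_0^{\infty} dy\, \log\left[1 + \frac{1}{2\sqrt{\pi}}\, \Gamma(-1/2, y)\right] = 0.626508\ldots \tag{77}$$

where $\Gamma(-1/2, y) = \int_y^{\infty} dx\, x^{-3/2}\, e^{-x}$ is the incomplete Gamma function. Thus, the age of the longest record ($\sim n$) is much larger than the typical age ($\sim \sqrt{n}$) for large $n$.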
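The constant $c_1$ can be evaluated with elementary functions. Integrating by parts gives $\Gamma(-1/2, y) = 2 e^{-y}/\sqrt{y} - 2\sqrt{\pi}\, \mathrm{erfc}(\sqrt{y})$, so the argument of the logarithm equals $\mathrm{erf}(\sqrt{y}) + e^{-y}/\sqrt{\pi y}$. The sketch below uses this identity together with the substitution $y = u^2$, which removes the logarithmic singularity at the origin; both steps are worked out here rather than taken from the text, and the function names are illustrative.

```python
import math

def integrand_u(u):
    """4 u log[erf(u) + exp(-u^2) / (u sqrt(pi))], i.e. the integrand after y = u^2."""
    if u == 0.0:
        return 0.0  # u log(1/u) -> 0 as u -> 0
    inner = math.erf(u) + math.exp(-u * u) / (u * math.sqrt(math.pi))
    return 4.0 * u * math.log(inner)

def simpson(f, a, b, n_intervals):
    """Composite Simpson rule; n_intervals must be even."""
    h = (b - a) / n_intervals
    total = f(a) + f(b)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def longest_record_constant(u_max=6.0, n_intervals=60000):
    """Numerical estimate of c_1; the integrand is negligible beyond u_max."""
    return simpson(integrand_u, 0.0, u_max, n_intervals)

if __name__ == "__main__":
    print(longest_record_constant())
```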

For the shortest lasting record $l_{\min} = \min(l_1, l_2, \ldots, l_R)$, it is also useful to consider the cumulative distribution $Z(l|n) = \mathrm{Prob}[l_{\min} \ge l]$ given $n$. This event is equivalent to having the lengths $l_i \ge l$ for all $i = 1, 2, \ldots, R$. Following a similar procedure as in the case of the longest lasting record, one finds the generating function

$$\sum_{n=0}^{\infty} Z(l|n)\, s^n = \frac{\sum_{l'=l}^{\infty} q(l')\, s^{l'}}{1 - \sum_{l'=l}^{\infty} f(l')\, s^{l'}}. \tag{78}$$

One can then extract, in a similar way, the asymptotic large $n$ behavior $\langle l_{\min} \rangle \sim \sqrt{n/\pi}$ [52]. Thus, the mean age of the shortest lasting record grows in a similar way as that of a typical record, i.e., as $\sqrt{n}$, albeit with a smaller prefactor $1/\sqrt{\pi} = 0.56419\ldots$ compared with $\sqrt{\pi}/2 = 0.88623\ldots$.



C. Two Generalizations 



In the discussion above for the statistics of records, we had assumed that the jump length distribution $\phi(\xi)$ is symmetric and continuous. However, the basic renewal equation (69) is valid for continuous but asymmetric jump distributions as well. The only difference is that we have to use the appropriate expressions for $f_-(l)$ and $q_-(l)$ from the generalized Sparre Andersen theorem. For example, the generating function for the distribution $P(R|n)$ of the number of records up to step $n$ is given by the asymmetric version of (70)

$$\sum_{n=R-1}^{\infty} P(R|n)\, s^n = [\tilde f_-(s)]^{R-1}\, \tilde q_-(s) = [1 - (1-s)\, \tilde q_-(s)]^{R-1}\, \tilde q_-(s) \tag{79}$$

where $\tilde q_-(s)$ is given by (42).

Indeed, for the special case of a random walk sequence in the presence of a drift $\mu$ and Cauchy distributed jumps as in (43), one can obtain explicit results for $P(R|n)$ in (79). Using (48), one gets the exact generating function $\tilde q_-(s) = (1-s)^{-c_-}$, and substituting this in (79) gives

$$\sum_{n=R-1}^{\infty} P(R|n)\, s^n = \frac{[1 - (1-s)^{1-c_-}]^{R-1}}{(1-s)^{c_-}} \tag{80}$$

from which it follows that the average number of records grows anomalously, $\langle R \rangle \sim n^{1-c_-} = n^{c_+}$, for large $n$. Using $c_+ = 1/2 + \tan^{-1}(\mu/a)/\pi$, one sees that as $\mu \to \infty$ (positive drift away from the origin), $c_+ \to 1$ and thus the average number of records grows linearly with the number of steps $n$, i.e., on average a new record happens at every step. Of course, this is expected in the presence of an infinite drift since the particle moves ballistically along the positive semi-axis. On the other hand, as $\mu \to -\infty$, $c_+ \to 0$, indicating that the average number of records does not grow with $n$ for large $n$. This is also expected since the particle mostly stays on the negative side of the origin when $\mu \to -\infty$ and thus hardly ever makes a positive record. These results were then used to understand the anomalous avalanche size distribution in a model of a particle moving in a random potential [70].

Another interesting generalization of these results emerged from the following observation: it turns out that the constant $c_1 = 0.626508\ldots$ that appears as the prefactor of the linear growth of the longest lasting record in (77) also appears in the excursion theory of Brownian motion [71]. Let us consider a Brownian motion over a time interval $[0, t]$ and consider the set of successive zero-crossing intervals or excursions (see Fig. 4). Let us denote the maximum excursion length up to time $t$ by $l_{\max}(t)$,

$$l_{\max}(t) = \max(\tau_1, \tau_2, \ldots, \tau_N, A(t)) \tag{81}$$

where $A(t)$ denotes the length of the last interval before $t$ (see Fig. 4). Let $Q(t) = \mathrm{Prob}[l_{\max}(t) = A(t)]$ denote the probability that the last incomplete excursion is the longest one. Then it turns out [71] that $Q(t)$ tends, for large $t$, to the same constant, $Q(t) \to c_1 = 0.626508\ldots$, as in Eq. (77). We were able to understand recently why this same constant $c_1$ appears in apparently different observables, namely (i) in the length of the longest lasting record and (ii) in the probability $Q(t)$ that the last excursion is the longest [72]. This understanding led us to study the statistics of $l_{\max}(t)$ and that of $Q(t)$ for generic stochastic processes going beyond the simple Brownian motion [72]. The statistics of $l_{\max}(t)$ turns out to have interesting universal features that allowed us to distinguish between stochastic processes that are smooth (i.e., with a finite density of zero crossings) and ones that are rough (where the density of zero crossings is infinite, as in the case of the Brownian motion) [72].

FIG. 4: A trajectory of a Brownian motion over $[0, t]$ with $N$ completed excursions of lengths $\{\tau_1, \tau_2, \ldots, \tau_N\}$ and the last incomplete excursion of length $A(t)$.



In summary, I have discussed the universal first-passage properties associated with a discrete-time random walk sequence consisting of $n$ steps, where the walker starts at the origin $x_0 = 0$ and at each step jumps by a random amount drawn independently from a symmetric and continuous distribution $\phi(\xi)$. The first-passage probability is universal, i.e., independent of the jump length distribution, due to the Sparre Andersen theorem. We have then used the consequences of this result for the statistics of two extreme observables: (i) the global maximum of the walk and the step at which it occurs and (ii) the number and ages of records. We have seen that the distribution of the time of the maximum as well as the record statistics become universal as a consequence of the Sparre Andersen theorem.

The distribution of the value of the maximum, however, is non-universal and depends explicitly on $\phi(\xi)$. The
random variables belonging to this sequence are correlated, so the standard
EVS of i.i.d. random variables does not apply to the distribution of the maximum. The computation of this distribution
for the discrete-time sequence is thus nontrivial, even though in the corresponding
continuous-time Brownian motion it is easy to compute. However, thanks to the Pollaczek-Spitzer formula, one knows,
at least in principle, how to compute the generating function of this maximum distribution for arbitrary symmetric
and continuous $\phi(\xi)$. The leading large $n$ behavior of the moments of the maximum can be extracted relatively easily
from this explicit Pollaczek-Spitzer formula. However, extracting the subleading finite-size correction term turns out
to be much trickier. At least for the expected maximum, we have seen how to compute exactly the leading finite-size
correction term, both for the case when the jump distribution has a finite variance and for the case of Lévy flights
with index $1 < \mu < 2$. These results are interesting because the expected maximum of a discrete-time random walk
is exactly related to the perimeter of the convex hull of a planar random walk which has important applications in
the estimation of home range of animals in ecology [l^, . 
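
To make the leading term and its finite-size correction concrete, here is a small sketch for one special case, Gaussian jumps of unit variance. It rests on an assumption that is not derived in this section: the classical Spitzer-Kac identity E[M_n] = sum_{k=1}^{n} E[max(x_k, 0)] / k for M_n = max(0, x_1, ..., x_n). For Gaussian jumps x_k is normal with variance k, so each term is known in closed form. The function names are hypothetical.

```python
import math

ZETA_HALF = -1.4603545088095868  # numerical value of the Riemann zeta function at 1/2


def expected_max_gaussian(n, sigma=1.0):
    """E[M_n] for Gaussian jumps, using the assumed identity sum_k E[max(x_k, 0)] / k."""
    # x_k ~ N(0, k sigma^2), hence E[max(x_k, 0)] = sigma * sqrt(k / (2 pi)).
    partial_sum = sum(1.0 / math.sqrt(k) for k in range(1, n + 1))
    return sigma * partial_sum / math.sqrt(2.0 * math.pi)


def leading_term(n, sigma=1.0):
    """Brownian-limit growth sigma * sqrt(2 n / pi)."""
    return sigma * math.sqrt(2.0 * n / math.pi)


if __name__ == "__main__":
    limit = ZETA_HALF / math.sqrt(2.0 * math.pi)
    for n in (10, 100, 1000, 10000):
        gap = expected_max_gaussian(n) - leading_term(n)
        print(f"n = {n:6d}  E[M_n] - sqrt(2n/pi) = {gap:+.5f}  (limit {limit:+.5f})")
```

In this Gaussian case the gap between the exact sum and the leading term settles to a negative constant, zeta(1/2)/sqrt(2 pi), which illustrates that the correction is of order one and is sensitive to the discreteness of the walk. Other jump distributions would need their own evaluation of E[max(x_k, 0)].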

It would also be interesting to compute the expected maximum and the distribution of the time of its occurrence in
the presence of a drift. In the Brownian limit with drift, the distribution of the time at which the maximum
occurs has been computed using a path integral method [47], with interesting applications in finance. For
the discrete-time case, however, I am not aware of any result so far.
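
Short of an analytical result, the distribution of the time of the maximum can at least be sampled directly. The sketch below is only a numerical experiment under assumptions chosen for illustration: Gaussian jumps of unit variance, a constant drift added to every jump, and hypothetical function names. For zero drift it can be checked against the universal law P(m|n) = q(m) q(n-m) with q(n) = C(2n, n) 2^(-2n); for nonzero drift it makes no claim beyond the histogram it produces.

```python
import math
import random


def time_of_max_histogram(n, drift=0.0, trials=20000, seed=0):
    """Empirical distribution of the step m at which max(x_0, ..., x_n) is attained."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        x, best, argmax = 0.0, 0.0, 0  # the walk starts at x_0 = 0
        for step in range(1, n + 1):
            x += drift + rng.gauss(0.0, 1.0)
            if x > best:
                best, argmax = x, step
        counts[argmax] += 1
    return [c / trials for c in counts]


def q(n):
    """Sparre Andersen survival probability C(2n, n) / 2^(2n)."""
    return math.comb(2 * n, n) / 4 ** n


def driftless_prediction(n):
    """Universal zero-drift law P(m | n) = q(m) q(n - m)."""
    return [q(m) * q(n - m) for m in range(n + 1)]


if __name__ == "__main__":
    n = 10
    for drift in (0.0, 0.5):
        hist = time_of_max_histogram(n, drift)
        print(f"drift = {drift}:", [round(p, 3) for p in hist])
    print("zero-drift law:", [round(p, 3) for p in driftless_prediction(n)])
```

With a positive drift the weight shifts toward the last steps, as one would expect, but the detailed shape is exactly what remains to be computed.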

There are interesting generalizations of the results presented here. For example, concerning the statistics of records,
we have studied only the statistics of 'positive' records, i.e., when the value $x_i$ of a record that occurs at step $i$ is
bigger than all previous values, given that the sequence started at $x_0 = 0$. It would be interesting to investigate the
statistics of the records of the absolute values of the sequence, i.e., of $\{0, |x_1|, |x_2|, \ldots, |x_n|\}$ which, to my knowledge,
has not yet been studied [73].
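
Both kinds of records are simple to count in a simulation. The following sketch assumes Gaussian jumps and counts the initial entry as the first record. For positive records it compares the sample mean with the universal closed form (2n + 1) C(2n, n) 2^(-2n), which I take as given here. For the records of the absolute values it only reports a sample mean, with no theoretical prediction attached. All function names are hypothetical.

```python
import math
import random


def count_records(path):
    """Number of records in a sequence, counting the initial entry as the first record."""
    records, current_max = 1, path[0]
    for value in path[1:]:
        if value > current_max:
            records, current_max = records + 1, value
    return records


def mean_records_exact(n):
    """Assumed closed form for the mean number of positive records after n steps."""
    return (2 * n + 1) * math.comb(2 * n, n) / 4 ** n


def random_walk(n, rng):
    """Positions x_0 = 0, x_1, ..., x_n for unit-variance Gaussian jumps."""
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path


def mean_records(n, transform=lambda x: x, trials=20000, seed=0):
    """Sample mean of the record count of transform(x_i) along the walk."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_records([transform(x) for x in random_walk(n, rng)])
    return total / trials


if __name__ == "__main__":
    n = 20
    print("positive records, closed form:", mean_records_exact(n))
    print("positive records, simulated  :", mean_records(n))
    print("records of |x_i|, simulated  :", mean_records(n, transform=abs))
```

Passing `transform=abs` switches from the sequence itself to its absolute values, so the same counting routine serves both questions.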

As I already mentioned, the record statistics of this Markov sequence has been studied in the presence of a constant drift
with interesting applications in avalanche dynamics [70]. In particular, we have seen one case, namely the Cauchy
distribution with drift, where the average number of records grows with the sequence size $n$ anomalously with a
nontrivial drift-dependent exponent [70]. It would not be difficult to compute the distribution of the ages of records in
this particular case. The study of the age distribution of records for an arbitrary asymmetric jump distribution remains
an open problem.

Another interesting generalization is to consider the Markov sequence generated by the recursion $x_n = r\, x_{n-1} + \xi_n$,
where $0 < r < 1$ is a parameter and the $\xi_n$'s are, as before, symmetric i.i.d. noise variables. This is just a discrete-time
analogue of the continuous-time Ornstein-Uhlenbeck (OU) process of a particle moving in a harmonic potential. This is
seen by writing $x_n - x_{n-1} = -(1 - r)\, x_{n-1} + \xi_n$, which, in the continuous-time limit (along with the limit $r \to 1$), becomes
the OU process $dx/dt = -\lambda x + \xi(t)$, where $\xi(t)$ is a zero-mean Gaussian white noise. This discrete-time
sequence has many applications, e.g., it appears in the context of the practical sampling of experimental data on
the persistence of a stochastic process [74]-[77] and also in the simple system of a ball bouncing non-elastically on a
noisy platform [78]. In the latter context, the parameter $0 < r < 1$ represents the coefficient of restitution of the
collision of the ball with the platform [78] and the Brownian limit $r = 1$ corresponds to elastic collision. The
first-passage properties of this sequence for generic $0 < r < 1$ turn out to be highly nontrivial even for a Gaussian noise
distribution [74]. An explicit exact result is known only for the exponential noise distribution [78]. While, for generic
$0 < r < 1$, these first-passage properties are nonuniversal, one recovers several interesting universal properties in the
elastic limit $r \to 1$ [78]. It would be interesting to study the statistics of the maximum and that of the records in this
simple Markov sequence for arbitrary $0 < r < 1$ and arbitrary noise distribution.
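
The dependence on $r$ is easy to see in a simulation of the recursion. The sketch below assumes unit-variance Gaussian noise and estimates the probability that $x_1, \ldots, x_n$ all stay non-negative, starting from $x_0 = 0$. Two limits serve as checks: at $r = 0$ the $x_k$ are i.i.d. and symmetric, so the probability is $2^{-n}$, and at $r = 1$ the sequence is the random walk, for which I assume the Sparre Andersen value C(2n, n) 2^(-2n). The function name and parameters are illustrative.

```python
import math
import random


def survival_probability(n, r, trials=40000, seed=0):
    """Estimate Prob[x_1 >= 0, ..., x_n >= 0] for x_k = r * x_{k-1} + xi_k, x_0 = 0."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        x = 0.0
        for _ in range(n):
            x = r * x + rng.gauss(0.0, 1.0)  # one step of the recursion
            if x < 0.0:
                break  # first passage below the origin
        else:
            survived += 1
    return survived / trials


if __name__ == "__main__":
    n = 5
    for r in (0.0, 0.5, 0.9, 1.0):
        print(f"r = {r:.1f}: survival over {n} steps ~ {survival_probability(n, r):.4f}")
    print("r = 0 exact:", 0.5 ** n)
    print("r = 1 exact:", math.comb(2 * n, n) / 4 ** n)
```

Intermediate values of $r$ interpolate between the two limits, and for those values the estimate would in general change if the Gaussian noise were replaced by another symmetric distribution.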

In conclusion, there are still many unresolved questions associated with even simple one-dimensional random walks.
New applications raise new questions, and solving them often requires new techniques that are
nontrivial and interesting in their own right.